i've updated the tv show management script. it now tries a prioritized list of regular expressions, one after another, until one matches. it also grabs the name of the episode from thetvdb.com. post-information-gathering hooks have been added (these let me grab the name of the episode as well as sanitize the filename for the filesystem). finally, hooks have been added to fix show names automatically (e.g. Brothers and Sisters becomes Brothers & Sisters so Boxee can find the right series on thetvdb).
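here's a rough sketch of the prioritized-regexp idea in bash. it is not the actual script: the two patterns, the function names (fix_show_name, parse_episode) and the show|season|episode output format are all made up for illustration. each pattern is tried in order, the first one that matches wins, and then the show name goes through a fix-up step like the Brothers & Sisters one.

```shell
#!/usr/bin/env bash

# hypothetical patterns, highest priority first.
# each one captures: show name, season number, episode number.
patterns=(
  '^(.+)[._ ][Ss]([0-9]+)[Ee]([0-9]+)'
  '^(.+)[._ ]([0-9]+)x([0-9]+)'
)

# hypothetical name-fixing hook: map a parsed name to the one thetvdb expects
fix_show_name() {
  case "$1" in
    "Brothers and Sisters") echo "Brothers & Sisters" ;;
    *) echo "$1" ;;
  esac
}

# try every pattern in order and stop at the first match
parse_episode() {
  local file=$1 re show
  for re in "${patterns[@]}"; do
    if [[ $file =~ $re ]]; then
      # turn dots and underscores in the show name into spaces
      show=${BASH_REMATCH[1]//[._]/ }
      show=$(fix_show_name "$show")
      # 10# forces base 10 so "08" is not treated as octal
      printf '%s|%d|%d\n' "$show" \
        "$((10#${BASH_REMATCH[2]}))" "$((10#${BASH_REMATCH[3]}))"
      return 0
    fi
  done
  return 1  # nothing matched
}
```

with these patterns, parse_episode "Brothers.and.Sisters.S03E08.avi" prints Brothers & Sisters|3|8, and a name that matches nothing returns a non-zero status.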
i also rewrote my regular expression tool in c++. a simple, one-test benchmark yielded:
-for the java version (the command was "time java -jar RegExpEval "(Hello)(World)" "Hello" "test""):
--real 0m0.446s
--user 0m0.107s
--sys 0m0.057s
-for the c++ version (the command was "time GetRegExp "(Hello)(World)" "Hello" "test""):
--real 0m0.065s
--user 0m0.002s
--sys 0m0.007s
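it's not exactly scientific, but the c++ version is a lot faster than the java version: roughly 7x in wall-clock time on this one run (0.065s vs 0.446s). some of the java time is probably just jvm startup.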
i also wrote a script to remove old episodes. by default, the script keeps the latest 5 episodes of each series. it removes old seasons as well. it's possible to exclude a show, season, or specific episode from being deleted.
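a minimal sketch of the "keep the latest 5" logic, under some assumptions that are mine and not necessarily how the real script works: episode names come in on stdin one per line, they sort into air order (so S01E02 comes before S01E10 and S02E01), and exclusions are passed as extended regexps. the function name old_episodes is made up. it only prints what would be removed, so it's a dry run.

```shell
#!/usr/bin/env bash

# print every episode except the newest $keep, skipping excluded ones.
# usage: old_episodes KEEP [EXCLUDE_REGEXP...] < list-of-episode-names
old_episodes() {
  local keep=$1; shift
  local -a eps=()
  local sorted ep re skip i

  # sort the names so the oldest episodes come first
  sorted=$(LC_ALL=C sort)
  while IFS= read -r ep; do
    if [[ -n $ep ]]; then eps+=("$ep"); fi
  done <<< "$sorted"

  # everything before the last $keep entries is a removal candidate
  for (( i = 0; i < ${#eps[@]} - keep; i++ )); do
    skip=0
    for re in "$@"; do
      if [[ ${eps[i]} =~ $re ]]; then skip=1; fi
    done
    if (( ! skip )); then printf '%s\n' "${eps[i]}"; fi
  done
}
```

something like ls | old_episodes 5 'S01E01' inside a series directory would list the candidates while leaving that one episode alone. actually deleting them would mean piping the output into rm, which i'd only do after eyeballing the list.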
might i also mention this is all done in bash shell script?
due to the way i've written the hooks, there are major security problems (i.e. a malicious hook can execute any code its author wants) because each hook is included (sourced) into the shell script and then a function from it is executed. the same goes for the regular expressions: each one is included and tested, and then a processing function runs so the regexp can fix various things.
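to make the problem concrete, here's a sketch of the include-then-call pattern. the names are assumptions for the example (a directory of *.sh hook files, each defining a function called hook_main), not the real layout. the point is that sourcing a file runs everything in it inside the main script's shell, not just the function definition.

```shell
#!/usr/bin/env bash

# source each hook file in a directory, then call the function it defines.
# hook_main and the *.sh naming are hypothetical.
run_hooks() {
  local dir=$1 hook
  shift
  for hook in "$dir"/*.sh; do
    [[ -e $hook ]] || continue
    # forget the previous hook's function before loading the next one
    unset -f hook_main
    # this executes the whole file with the script's privileges
    source "$hook"
    if declare -F hook_main >/dev/null; then
      hook_main "$@"
    fi
  done
}
```

a hook can do useful things this way, e.g. rewrite a global filename variable to strip characters the filesystem doesn't like, but any other command sitting in the file runs too, with full access to the script's variables and the user's files.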
anyways, i'll post the code soon; i'm too lazy to do it right now.