Super-quick 10-Q Breakdown

I did some Perl hacking (and I mean hacking of the ugly kind, not the "just another Perl hacker" kind). Of the 58,798 10-Qs I have on hand, 42,601 have a "risk factors" section.
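For the curious, here is roughly what that kind of count looks like. This is a minimal Python sketch, not the Perl I actually ran: it assumes each filing is already available as plain text and that the section is introduced by a heading along the lines of "Item 1A. Risk Factors". The names `RISK_HEADING`, `has_risk_factors` and `count_with_risk_factors` are made up for illustration.

```python
import re

# Assumed heading shape: a line starting with "Item 1A", optional
# punctuation, then "Risk Factors". Real filings vary, so treat this
# as a heuristic rather than a definitive test.
RISK_HEADING = re.compile(
    r"^\s*item\s+1a\s*[.:\-]?\s*risk\s+factors",
    re.IGNORECASE | re.MULTILINE,
)


def has_risk_factors(text):
    """Return True if the filing text contains a risk-factors heading."""
    return RISK_HEADING.search(text) is not None


def count_with_risk_factors(texts):
    """Count how many filing texts contain a risk-factors heading."""
    return sum(1 for text in texts if has_risk_factors(text))
```

A heading match will also fire on a table-of-contents entry, which is fine when the only question is whether the section exists at all.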

SEC disclosure text mining (minor) project update

I've been having fun dealing with the joys of unstructured text processing. The ambiguity in the previous sentence is deliberate: I mean both the joys of processing unstructured text and the joy of unstructured processing of text.

Autogenerating BibTeX citations from ISBN

Looks like it can be done via WorldCat's xisbn API. My approach is something like this:

    foo:~ $ curl ',ed,title,author,publisher,city'

which returns an XML block like the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <rsp xmlns="" stat="ok">
    <isbn year="1986" ed="11th printing." title="Evolution and …
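Turning a response like that into BibTeX could go something like the sketch below. It is a rough Python illustration under several assumptions: the record is a single `isbn` element carrying the attributes named in the request (year, ed, title, author, publisher, city), the element text is the ISBN itself, and a surname-plus-year citation key is acceptable. `SAMPLE_XML` is a made-up record shaped like the truncated excerpt above, not real API output, and `isbn_to_bibtex` is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical response; every value here is a placeholder.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<rsp stat="ok">
  <isbn year="1999" ed="1st ed." title="An Example Title" author="Jane Doe." publisher="Example Press" city="Springfield">0000000000</isbn>
</rsp>"""


def isbn_to_bibtex(xml_bytes):
    """Build a BibTeX @book entry from an xisbn-style XML response."""
    root = ET.fromstring(xml_bytes)
    if root.get("stat") != "ok":
        raise ValueError("lookup failed: stat=%r" % root.get("stat"))

    # Match on the local tag name so the code works with or without
    # an XML namespace on the response.
    records = [el for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "isbn"]
    if not records:
        raise ValueError("no isbn record in response")
    rec = records[0]

    author = rec.get("author", "").rstrip(". ")
    year = rec.get("year", "")
    # Citation key: last word of the author string plus the year.
    surname = author.split()[-1] if author else "anon"
    key = re.sub(r"\W", "", surname).lower() + year

    fields = [
        ("author", author),
        ("title", rec.get("title", "")),
        ("publisher", rec.get("publisher", "")),
        ("address", rec.get("city", "")),
        ("edition", rec.get("ed", "")),
        ("year", year),
        ("isbn", (rec.text or "").strip()),
    ]
    lines = ["@book{%s," % key]
    lines += [f"  {name} = {{{value}}}," for name, value in fields if value]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(isbn_to_bibtex(SAMPLE_XML))
```

The attribute strings come back in library-catalogue style (trailing periods, "11th printing." as an edition), so the output usually needs a quick hand edit before it goes into a .bib file.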

