epershand ([personal profile] epershand) wrote 2012-06-13 11:48 am

A brief love note to the AO3

You know, I kvetch about the AO3 a lot, and their coding team has been doing a lot of hustling lately without getting a lot of love, but damn. Sometimes I am just hit by how fucking RIGHT they've done something.

For example: right now I'm helping out [personal profile] starlady with a fandom studies project by writing her a script that looks at fanfiction HTML and extracts fandom, ship, publication date, etc. I'm writing dedicated parsers for a few major fic archives.

This is (roughly speaking) what my code looks like for the AO3:

def GetAo3Metadata(self):
    """Extract metadata from Archive of Our Own Beautiful Soup object."""
    self.metadata.author = ...  # Find the "a" tag with the class "login author"
    self.metadata.title = ...  # Find the "h2" tag with the class "title heading"
    self.metadata.rating = ...  # Get all items from the list with the class "rating tags"
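
For the curious, here is a minimal runnable sketch of why those class attributes make life easy. It is not the actual script: the sample HTML is made up, the names `ClassTextCollector` and `get_ao3_metadata` are hypothetical, and it uses the standard library's `html.parser` in place of Beautiful Soup so it runs with nothing installed. The only things taken from the pseudocode above are the three class strings.

```python
from html.parser import HTMLParser

# Hypothetical, heavily simplified stand-in for an AO3 work page.
SAMPLE_AO3_HTML = """
<h2 class="title heading">An Example Fic</h2>
<a class="login author" href="/users/example">example_author</a>
<ul class="rating tags"><li>General Audiences</li></ul>
"""

# Tags that never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextCollector(HTMLParser):
    """Collect the text found inside elements whose class attribute matches exactly."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.found = {name: [] for name in wanted_classes}
        # One entry per open tag: the wanted class it matched, or None.
        self.stack = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        class_value = dict(attrs).get("class")
        self.stack.append(class_value if class_value in self.wanted else None)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Credit the text to every enclosing element that has a wanted class.
        for name in self.stack:
            if name is not None:
                self.found[name].append(text)


def get_ao3_metadata(html):
    """Return author, title, and rating using nothing but class attributes."""
    collector = ClassTextCollector(["login author", "title heading", "rating tags"])
    collector.feed(html)
    return {
        "author": " ".join(collector.found["login author"]),
        "title": " ".join(collector.found["title heading"]),
        "rating": collector.found["rating tags"],
    }


metadata = get_ao3_metadata(SAMPLE_AO3_HTML)
```

Because each field is labelled by its class, the whole extraction is one lookup per field, with no guessing about page layout.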

This is, roughly speaking, what the code looks like for everything else:

def ParseFanfictionNetMetadata(self):
    """Extract metadata from Fanfiction.net Beautiful Soup object."""
    # Find the block called "gui_table1" because, you know, that's meaningful.
    # Fuck it, just extract all the text from that block.
    # And then do a regular expression search.
    # And then take a shot.
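
As a rough sketch of that fallback, assuming the block has already been flattened to a plain string: the sample text and field labels below are invented for illustration and are not Fanfiction.net's real format, and `parse_flat_metadata` is a hypothetical name.

```python
import re

# Made-up text, standing in for what is left after flattening an unlabelled block.
SAMPLE_BLOCK_TEXT = (
    "Rated: T - English - Romance - Chapters: 3 - Words: 4,521 - Published: 06-13-12"
)


def parse_flat_metadata(text):
    """Pull fields out of a text blob by pattern, since no markup identifies them."""
    patterns = {
        "rating": r"Rated:\s*(\S+)",
        "words": r"Words:\s*([\d,]+)",
        "published": r"Published:\s*([\d-]+)",
    }
    metadata = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, text)
        # A missing field becomes None instead of raising an error.
        metadata[field] = match.group(1) if match else None
    return metadata


flat_metadata = parse_flat_metadata(SAMPLE_BLOCK_TEXT)
```

Every field depends on the wording of the text staying put, which is exactly why this approach is brittle.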

Or like this:

def ParseYuletideTreasureMetadata(self):
    """Extract metadata from Yuletidetreasure.org Beautiful Soup object."""
    # Fuck is this the nineties? Are there really NO DIVS in this code?
    # Or class attributes?
    # Or even fucking paragraph blocks?
    # Fuck it, I'm drinking.

[personal profile] starlady 2012-06-14 05:32 am (UTC)
Ahahaha TRUTH.

[personal profile] krait 2012-06-14 06:41 am (UTC)