epershand: "Python: programming the way Guido indented it" (python)
You know, I kvetch about the AO3 a lot, and their coding team has been doing a lot of hustling lately without getting a lot of love, but damn. Sometimes I am just hit by how fucking RIGHT they've done something.

For example: right now I'm helping out starlady with a fandom studies project by writing her a script that looks at fanfiction HTML and extracts fandom, ship, publication date, etc. I'm writing dedicated parsers for a few major fic archives.

This is (roughly speaking) what my code looks like for the AO3:

def GetAo3Metadata(self):
    """Extract metadata from Archive of Our Own Beautiful Soup object."""
    self.metadata.author = # Find the "a" tag with the class "login author"
    self.metadata.title = # Find the "h2" tag with the class "title heading"
    self.metadata.rating = # Get all items from the list with the class "rating tags"
    etc.
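
To make the "find the tag with this class" idea concrete, here is a minimal sketch under loud assumptions. The sample markup is invented for illustration and may not match real AO3 pages. The names `ClassTextExtractor` and `extract_metadata` are hypothetical. It also swaps in the standard library's `html.parser` for Beautiful Soup so it runs with nothing installed.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for illustration; real AO3 pages may differ.
SAMPLE_HTML = """
<h2 class="title heading">An Example Work</h2>
<a class="login author" href="/users/example">example_author</a>
<ul class="rating tags"><li>General Audiences</li></ul>
"""

# Tags that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element whose class attribute is wanted."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)  # exact class attribute strings to look for
        self.found = {name: [] for name in wanted}
        self._stack = []  # one entry per open tag: matched class string or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        cls = dict(attrs).get("class")
        self._stack.append(cls if cls in self.wanted else None)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Credit the text to every enclosing element that matched.
        for cls in self._stack:
            if cls is not None:
                self.found[cls].append(text)


def extract_metadata(html):
    """Return author, title and rating text keyed by field name."""
    parser = ClassTextExtractor(["login author", "title heading", "rating tags"])
    parser.feed(html)
    parser.close()
    return {
        "author": parser.found["login author"],
        "title": parser.found["title heading"],
        "rating": parser.found["rating tags"],
    }


print(extract_metadata(SAMPLE_HTML))
```

The point is that each field is one lookup by a class name that says what it holds. The sketch assumes reasonably well-formed markup, and it matches the class attribute as an exact string, so it would miss the same classes listed in a different order.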


This is, roughly speaking, what the code looks like for everything else:

def ParseFanfictionNetMetadata(self):
    """Extract metadata from Fanfiction.net Beautiful Soup object."""
    # Find the block called "gui_table1" because, you know, that's meaningful.
    # Fuck it, just extract all the text from that block.
    # And then do a regular expression search.
    # And then take a shot.
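
For contrast, a rough sketch of the "grab all the text and regex it" fallback. The sample string, the field labels, and the names `FIELD_PATTERNS` and `scrape_fields` are all invented for illustration; none of this is the real Fanfiction.net format.

```python
import re

# Hypothetical text, invented for illustration. It stands in for whatever
# comes out of flattening an unlabelled layout block to plain text.
SAMPLE_TEXT = "Fiction Rated: K - English - Humor - Published: 01-02-03 - id: 12345"

# One pattern per field, each with a single capture group (assumed labels).
FIELD_PATTERNS = {
    "rating": re.compile(r"Rated:\s*(\S+)"),
    "published": re.compile(r"Published:\s*(\d{2}-\d{2}-\d{2})"),
}


def scrape_fields(text):
    """Return the first capture for each pattern, or None when it is absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


print(scrape_fields(SAMPLE_TEXT))
```

Every field here depends on the wording and ordering of free text rather than on markup, so any change to a label or date format silently turns a value into `None`.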


Or like this:

def ParseYuletideTreasureMetadata(self):
    """Extract metadata from Yuletidetreasure.org Beautiful Soup object."""
    # Fuck is this the nineties? Are there really NO DIVS in this code?
    # Or class attributes?
    # Or even fucking paragraph blocks?
    # Fuck it, I'm drinking.
