<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dw="https://www.dreamwidth.org">
  <id>tag:dreamwidth.org,2009-05-01:171716</id>
  <title>epershand</title>
  <subtitle>Knowledge, sir! Useful for its own sake, sir!</subtitle>
  <author>
    <name>epershand</name>
  </author>
  <link rel="alternate" type="text/html" href="https://epershand.dreamwidth.org/"/>
  <link rel="self" type="text/xml" href="https://epershand.dreamwidth.org/data/atom"/>
  <updated>2012-06-13T19:01:03Z</updated>
  <dw:journal username="epershand" type="personal"/>
  <entry>
    <id>tag:dreamwidth.org,2009-05-01:171716:73621</id>
    <link rel="alternate" type="text/html" href="https://epershand.dreamwidth.org/73621.html"/>
    <link rel="self" type="text/xml" href="https://epershand.dreamwidth.org/data/atom/?itemid=73621"/>
    <title>A brief love note to the AO3</title>
    <published>2012-06-13T19:00:38Z</published>
    <updated>2012-06-13T19:01:03Z</updated>
    <category term="fandom: archives"/>
    <category term="html pedantry"/>
    <category term="it works bitches"/>
    <category term="fake code"/>
    <category term="archive of our own"/>
    <category term="coding"/>
    <dw:security>public</dw:security>
    <dw:reply-count>2</dw:reply-count>
    <content type="html">You know, I kvetch about the AO3 a lot, and their coding team has been doing a lot of hustling lately without getting a lot of love, but damn. Sometimes I am just hit by how fucking RIGHT they've done something.&lt;br /&gt;&lt;br /&gt;For example: right now I'm helping out &lt;span style='white-space: nowrap;'&gt;&lt;a href='https://starlady.dreamwidth.org/profile'&gt;&lt;img src='https://www.dreamwidth.org/img/silk/identity/user.png' alt='[personal profile] ' width='17' height='17' style='vertical-align: text-bottom; border: 0; padding-right: 1px;' /&gt;&lt;/a&gt;&lt;a href='https://starlady.dreamwidth.org/'&gt;&lt;b&gt;starlady&lt;/b&gt;&lt;/a&gt;&lt;/span&gt; with a fandom studies project, by writing her a script that looks at fanfiction html and extracts fandom, ship, publication date, etc. I'm writing dedicated parsers for a few major fic archives.&lt;br /&gt;&lt;br /&gt;This is (roughly speaking) what my code looks like for the AO3:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;def GetAo3Metadata(self):&lt;br /&gt;    """Extract metadata from Archive of Our Own Beautiful Soup object."""&lt;br /&gt;    self.metadata.author = # Find the "a" tag with the class "login author"&lt;br /&gt;    self.metadata.title = # Find the "h2" tag with the class "title heading"&lt;br /&gt;    self.metadata.rating = # Get all items from the list with the class "rating tags"&lt;br /&gt;    etc.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;This is, roughly speaking, what the code looks like for everything else:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;def ParseFanfictionNetMetadata(self):&lt;br /&gt;    """Extract metadata from Fanfiction.net Beautiful Soup object."""&lt;br /&gt;    # Find the block called "gui_table1" because, you know, that's meaningful.&lt;br /&gt;    # Fuck it, just extract all the text from that block.&lt;br /&gt;    # And then do a regular expression search.&lt;br /&gt;    # And then take a shot.&lt;br /&gt;&lt;/blockquote&gt;&lt;br 
/&gt;&lt;br /&gt;Or like this:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;def ParseYuletideTreasureMetadata(self):&lt;br /&gt;   """Extract metadata from Yuletidetreasure.org Beautiful Soup object."""&lt;br /&gt;   # Fuck is this the nineties? Are there really NO DIVS in this code?&lt;br /&gt;   # Or class attributes?&lt;br /&gt;   # Or even fucking paragraph blocks?&lt;br /&gt;   # Fuck it, I'm drinking.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;img src="https://www.dreamwidth.org/tools/commentcount?user=epershand&amp;ditemid=73621" width="30" height="12" alt="comment count unavailable" style="vertical-align: middle;"/&gt; comments</content>
  </entry>
  <entry>
    <id>tag:dreamwidth.org,2009-05-01:171716:57886</id>
    <link rel="alternate" type="text/html" href="https://epershand.dreamwidth.org/57886.html"/>
    <link rel="self" type="text/xml" href="https://epershand.dreamwidth.org/data/atom/?itemid=57886"/>
    <title>Damn You, Mark Pilgrim</title>
    <published>2011-08-11T14:10:28Z</published>
    <updated>2011-08-11T14:12:06Z</updated>
    <category term="coding"/>
    <category term="enthusiasm"/>
    <category term="internets"/>
    <category term="big damn geek sir"/>
    <category term="semantic web"/>
    <category term="books"/>
    <dw:security>public</dw:security>
    <dw:reply-count>1</dw:reply-count>
    <content type="html">All this week, I have been making the same mistake. I look at the clock and think "huh, I should go to bed soon. Maybe I'll just read a chapter of &lt;a href="http://diveintohtml5.org/"&gt;Dive Into HTML5&lt;/a&gt; before I go to bed."&lt;br /&gt;&lt;br /&gt;It is generally about two hours after this that I pull myself away from whatever fascinating and specific Wikipedia or Quora article or Joel on Software blog post or whatever I am currently reading, because &lt;em&gt;Dive Into HTML5&lt;/em&gt; is the TV Tropes of computer manuals.&lt;br /&gt;&lt;br /&gt;Seriously, read the chapter &lt;a href="http://diveintohtml5.org/past.html"&gt;A Quite Biased History of HTML5&lt;/a&gt; and tell me if YOU can drag yourself away from it and its links. Browser wars! Extended quotations of Marc Andreessen's emails! Snarky commentary on the methods of standards bodies!&lt;br /&gt;&lt;br /&gt;This thing is BETTER THAN THE &lt;a href="http://www.amazon.com/Operating-System-Concepts-Abraham-Silberschatz/dp/0470128720"&gt;DINOSAUR OPERATING SYSTEMS TEXTBOOK&lt;/a&gt;. (This is, for the record, the highest praise I can bestow on any book about computers.) But now I've got this fear that it's going to be like it was after that month where I read all the Sarah Vowell books. I went around wanting to tell people Exciting Facts! And the response was always "oh yeah, I think I read something like that in a Sarah Vowell book once." I am totally going to be all "BROWSER WARS!" 
and people will be like "oh yeah, that was an awesome chapter in &lt;em&gt;Dive Into HTML5&lt;/em&gt;."&lt;br /&gt;&lt;br /&gt;So far, the people on Twitter I've enthused at have linked me to:&lt;br /&gt;&lt;a href="http://diveintomark.org/archives/2004/01/14/thought_experiment"&gt;This snarky Pilgrim essay on XML&lt;/a&gt;&lt;br /&gt;&lt;a href="http://www.quora.com/Why-has-Microsoft-failed-to-make-Internet-Explorer-web-standards-compliant-in-spite-of-years-of-browser-market-share-loss"&gt;This commentary on the positive things IE did in the world of browser development&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Oh also the &lt;a href="http://en.wikipedia.org/wiki/Browser_wars"&gt;Wikipedia page on BROWSER WARS!&lt;/a&gt; is amazing. But you already know that because you have read the chapter above, which links to it.&lt;br /&gt;&lt;br /&gt;&lt;img src="https://www.dreamwidth.org/tools/commentcount?user=epershand&amp;ditemid=57886" width="30" height="12" alt="comment count unavailable" style="vertical-align: middle;"/&gt; comments</content>
  </entry>
</feed>
