Sitemap Scrape Statistics

Two of seven charts, rendered by following plot.ly's simple getting-started instructions.

We collect various counts while scraping sitemaps and report them as a text file. Here we plot the most recent counts available.

We fetch the counts.txt file and parse it line by line. Each line is its own JSON record.
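A minimal sketch of the line-by-line parse, assuming one JSON object per non-empty line. The function name `parseCounts` and the field names in the sample text are hypothetical; the real counts.txt may use different keys.

```javascript
// Parse newline-delimited JSON: one record per non-empty line.
function parseCounts(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Hypothetical sample text; real field names may differ.
const sampleText = '{"items": 10, "links": 25}\n{"items": 12, "links": 31}\n';
const records = parseCounts(sampleText);
console.log(records.length); // 2
```

In a browser the text could come from something like `fetch(url).then((res) => res.text()).then(parseCounts)`, where `url` stands in for wherever counts.txt is served.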

We render each field as a separate time series using the recently open-sourced plotly.js, following the advice in their quick-start documentation.
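One way to turn parsed records into one series per field might look like the sketch below. The helper `buildTraces` and the sample field names are assumptions for illustration, not the page's actual code; `Plotly.newPlot` is the standard plotly.js entry point and is only called when the library and a DOM are present.

```javascript
// Build one plotly trace per field found in the records.
function buildTraces(records) {
  const fields = new Set();
  for (const record of records) {
    for (const key of Object.keys(record)) fields.add(key);
  }
  return Array.from(fields).map((field) => ({
    name: field,
    type: "scatter",
    mode: "lines",
    x: records.map((_, i) => i), // sample number
    y: records.map((record) => (field in record ? record[field] : null)),
  }));
}

// Hypothetical records; real field names may differ.
const traces = buildTraces([
  { items: 10, links: 25 },
  { items: 12, links: 31 },
]);

// In a browser with plotly.js loaded, draw each field in its own chart.
if (typeof Plotly !== "undefined" && typeof document !== "undefined") {
  for (const trace of traces) {
    const div = document.createElement("div");
    document.body.appendChild(div);
    Plotly.newPlot(div, [trace], { showlegend: true });
  }
}
```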

The horizontal axis is in days. The scraper runs every six hours. Our first sample, sample number zero, was recorded on Sep 5, 2015 at 22:30 GMT.
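The mapping from sample number to days and to a timestamp can be sketched as follows. It assumes the scraper never skips or delays a run, which the real data may not guarantee; the function names are illustrative.

```javascript
// Sample zero: Sep 5, 2015 at 22:30 GMT (months are 0-based, so 8 = September).
const FIRST_SAMPLE_MS = Date.UTC(2015, 8, 5, 22, 30);
const SAMPLE_INTERVAL_HOURS = 6;

// Days elapsed since sample zero.
function sampleToDays(n) {
  return (n * SAMPLE_INTERVAL_HOURS) / 24;
}

// Nominal timestamp of sample n.
function sampleToDate(n) {
  return new Date(FIRST_SAMPLE_MS + n * SAMPLE_INTERVAL_HOURS * 3600 * 1000);
}

console.log(sampleToDays(4)); // 1
console.log(sampleToDate(4).toISOString()); // 2015-09-06T22:30:00.000Z
```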

Now with 6-hour growth rates for items and links.
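A growth rate per 6-hour sample can be read as the difference between consecutive counts. The sketch below assumes that interpretation; the page may compute it differently (for example as a percentage), and `growthPerSample` is a hypothetical name.

```javascript
// Difference between each count and the one before it.
// The result has one fewer element than the input.
function growthPerSample(values) {
  const growth = [];
  for (let i = 1; i < values.length; i++) {
    growth.push(values[i] - values[i - 1]);
  }
  return growth;
}

console.log(growthPerSample([10, 12, 17])); // [ 2, 5 ]
```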

Now with x-axis dates that snap to weeks.
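In plotly.js, weekly ticks on a date axis can be requested with `tick0` and `dtick` (in milliseconds). This is a sketch of one such layout, not necessarily the one used here; the starting tick, a Monday near the first sample, is an arbitrary choice, and the x values must be dates for it to apply.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Layout fragment placing one x-axis tick per week, starting at firstTick.
function weeklyAxisLayout(firstTick) {
  return {
    xaxis: {
      type: "date",
      tick0: firstTick,
      dtick: WEEK_MS,
      tickformat: "%b %d", // e.g. "Sep 07"
    },
  };
}

// Assumed anchor: Monday, Sep 7, 2015.
const layout = weeklyAxisLayout("2015-09-07");
console.log(layout.xaxis.dtick); // 604800000
```

The layout object would be passed as the third argument to `Plotly.newPlot`.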

Now with a runtime range of 20 to 80 minutes, tolerating timezone and daylight-saving shifts.
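One possible reading is that a measured runtime can be off by whole hours when the start and end times are recorded in different zones or across a daylight-saving change. Under that assumption, which the page does not confirm, the value can be folded back into the 20 to 80 minute window by shifting it in 60-minute steps. The name `foldRuntimeMinutes` is hypothetical.

```javascript
// Shift a runtime by whole hours until it lands in [20, 80) minutes.
// Assumes errors are exact multiples of 60 minutes.
function foldRuntimeMinutes(rawMinutes) {
  const LOW = 20;
  const SPAN = 60;
  return ((((rawMinutes - LOW) % SPAN) + SPAN) % SPAN) + LOW;
}

console.log(foldRuntimeMinutes(105)); // 45 (one hour too long)
console.log(foldRuntimeMinutes(-15)); // 45 (one hour too short)
```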