Hacker News
Common Crawl code contest - with fresh crawl of 3.2 billion web pages (commoncrawl.org)
23 points by Aloisius on July 18, 2012 | 5 comments



Where does it say they have 3.2 billion pages of fresh data?


This is Chris from Common Crawl. You are right - we didn't have stats about the latest crawl posted. We're putting them up today ...


Would it be possible to get a torrent with just the text-only part?


It is the 2012 data release linked from the first paragraph.



