{"id":570,"date":"2008-11-29T00:13:08","date_gmt":"2008-11-29T04:13:08","guid":{"rendered":"http:\/\/www.dr-chuck.com\/wordpress\/?p=570"},"modified":"2011-12-17T12:28:35","modified_gmt":"2011-12-17T16:28:35","slug":"wow-an-index-really-matters","status":"publish","type":"post","link":"https:\/\/www.dr-chuck.com\/csev-blog\/2008\/11\/wow-an-index-really-matters\/","title":{"rendered":"Wow &#8211; An Index really matters"},"content":{"rendered":"<p>I am doing some spidering using Python and SQLite 3. I made the spider restartable by storing the spidered material in the database. To make sure I did not re-retrieve a page after a restart, I was doing the following select to look up each URL before fetching it:<br \/>\nselect text from revisions where url = ? limit 0,1<br \/>\nBut once there were about 50,000 pages, this started to slow down a lot. Without an index on url, SQLite was doing a full table scan for every lookup. So I stopped the process and added an index:<br \/>\ncreate unique index revidx on revisions (url) ;<br \/>\nWow, it is so much faster. Nothing like a ton of data to remind you of the speed difference between Order(N) and Order(Log N).<br \/>\nI am liking Python and its built-in support for SQLite 3. Nice.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I am doing some spidering using Python and SQLite 3. I made the spider restartable by storing the spidered material in the database. To make sure I did not re-retrieve a page after a restart, I was doing the following select to look up each URL [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-570","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts\/570","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/comments?post=570"}],"version-history":[{"count":1,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts\/570\/revisions"}],"predecessor-version":[{"id":2683,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts\/570\/revisions\/2683"}],"wp:attachment":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/media?parent=570"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/categories?post=570"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/tags?post=570"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}