{"id":74,"date":"2008-09-24T10:47:21","date_gmt":"2008-09-24T15:47:21","guid":{"rendered":"http:\/\/www.webadminblog.com\/?p=74"},"modified":"2008-09-24T10:47:21","modified_gmt":"2008-09-24T15:47:21","slug":"owasp-google-hacking-project-owasp-appsec-nyc-2008","status":"publish","type":"post","link":"https:\/\/www.webadminblog.com\/index.php\/2008\/09\/24\/owasp-google-hacking-project-owasp-appsec-nyc-2008\/","title":{"rendered":"OWASP Google Hacking Project &#8211; OWASP AppSec NYC 2008"},"content":{"rendered":"<p>This presentation is by Christian Heinrich, the project leader for the OWASP &#8220;Google Hacking&#8221; project.\u00a0 Presentation published on http:\/\/www.slideshare.net\/cmlh\u00a0 Dual licensed under OWASP License and AU Creative Commons 2.5.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>OWASP Testing Guide v3 &#8211; Spiders\/Robots\/Crawlers<\/strong><\/span><\/p>\n<p>1. Automatically traverses hyperlinks<\/p>\n<p>2. Recursively retrieves content referenced<\/p>\n<p>Behavior governed by the robots exclusion protocol.\u00a0 New method is &lt;META NAME=&#8221;Googlebot&#8221; CONTENT=&#8221;nofollow&#8221;&gt;\u00a0 Not supported by all Robots\/Spiders\/Crawlers.\u00a0 Traditional method is robots.txt located in web root directory.\u00a0 Regular expressions supported by minority only.\u00a0 &#8220;User-agent: *&#8221; applies to all spiders\/robots\/crawlers or you can specify a specific robot name.\u00a0 Can be intentionally ignored.\u00a0 Not for httpd access control or digital rights management.<\/p>\n<p>Testing &#8211; Robots Exclusion Protocol<\/p>\n<ol>\n<li>Sign into Google Webmaster Tools<\/li>\n<li>On the dashboard, click the URL<\/li>\n<li>Click &#8220;Tools&#8221;<\/li>\n<li>Click &#8220;Analyze robots.txt&#8221;<\/li>\n<\/ol>\n<p><span style=\"text-decoration: underline;\"><strong>Search Engine Discovery<\/strong><\/span><\/p>\n<p>Microsoft Remote Desktop Web Connection: intitle:Remote.Desktop.Web.Connection inurl: 
tsweb<\/p>\n<p>VNC: &#8220;VNC Desktop&#8221; inurl:5800<\/p>\n<p>Outlook Web Access: inurl:&#8221;exchange\/logon.asp&#8221;<\/p>\n<p>Outlook Web Access: intitle:&#8221;Microsoft Outlook Web Access &#8211; Logon&#8221;<\/p>\n<p>Adobe Acrobat PDF: filetype:pdf<\/p>\n<p>Google caught onto this and is now displaying a &#8220;We&#8217;re sorry&#8221; message for certain searches.\u00a0 To get around this, use different search queries that return overlapping results.<\/p>\n<p>Google Advanced Search Operators: &#8220;site:&#8221; and &#8220;cache:&#8221;.\u00a0 There are two ways of using &#8220;site:&#8221;.\u00a0 Either as &#8220;site:www.google.com&#8221;, where you get results for that specific subdomain only, or as &#8220;site:google.com&#8221;, where you get all hostnames and subdomains. Use &#8220;cache:www.owasp.org&#8221; to display an indexed web page from the Google cache.\u00a0 There is also a link labeled &#8220;Cached&#8221; in the search results which does the same thing.<\/p>\n<p>You can get updates of the latest relevant Google results (web, news, etc.) using Google Alerts.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Download Indexed Cache<\/strong><\/span><\/p>\n<p>Google SOAP Search API.\u00a0 A query is limited to either 10 words or 2048 bytes.\u00a0 One thousand search queries per day, limited to search results within 0-999.\u00a0 Up to 10K possible results from 10 different search queries.<\/p>\n<p>$Google_SOAP_Search_API -&gt; doGoogleSearch( $key, $q, $start, $maxResults, $filter, $restricts, $safeSearch, $lr, $ie, $oe );<\/p>\n<p>See the presentation for the response.<\/p>\n<p>The proof of concept tool is &#8220;dic.pl&#8221; or &#8220;Download Indexed Cache&#8221;, which downloads the search results.\u00a0 It is licensed under the Apache License 2.0.\u00a0 The tool produces a URL and cachedSize response.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>OWASP Google Hacking Project<\/strong><\/span><\/p>\n<p>Tools are built in Perl using the CPAN modules SOAP::Lite, Net::Google, 
and Perl::Critic.\u00a0 The development environment is based on Eclipse with the EPIC plug-in.\u00a0 The Subversion repository is at code.google.com.<\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Roadmap<\/strong><\/span><\/p>\n<p>Upcoming presentations at ToorCon X in San Diego, SecTor 2008 in Toronto, Canada, and RUXCON 2K8 in Sydney, Australia.<\/p>\n<p>&#8220;TCP Input Text&#8221; Proof of Concept<\/p>\n<p>&#8220;Speak English&#8221; Google Translate Workaround<\/p>\n<p>Refactor and 3rd project review of the PoC Perl code, with public release at RUXCON 2K8 in November 2008.<\/p>\n<p>Check in at code.google.com after RUXCON 2K8.<\/p>\n<p>4 hr &#8220;half day&#8221; training course in Q1 2009.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This presentation is by Christian Heinrich, the project leader for the OWASP &#8220;Google Hacking&#8221; project.\u00a0 Presentation published on http:\/\/www.slideshare.net\/cmlh\u00a0 Dual licensed under OWASP License and AU Creative Commons 2.5. OWASP Testing Guide v3 &#8211; Spiders\/Robots\/Crawlers 1. Automatically traverses hyperlinks 2. 
Recursively retrieves content referenced Behavior governed by the robots exclusion protocol.\u00a0 New method is &lt;META [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[127],"tags":[76,626,100,132,12,133,102],"class_list":["post-74","post","type-post","status-publish","format-standard","hentry","category-owasp-appsec-nyc-2008","tag-application","tag-conferences","tag-google","tag-hacking","tag-owasp","tag-project","tag-web"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pfI0c-1c","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/posts\/74","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=74"}],"version-history":[{"count":4,"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/posts\/74\/revisions"}],"predecessor-version":[{"id":87,"href":"https:\/\/www.webadminblog.com\/index.php\/
wp-json\/wp\/v2\/posts\/74\/revisions\/87"}],"wp:attachment":[{"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=74"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=74"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.webadminblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=74"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}