
Nevilledog

(51,197 posts)
Wed Jun 2, 2021, 08:55 PM

Tips on how to use the Internet Archive/Wayback Machine in investigations.



Tweet text:
Craig Silverman
@CraigSilverman
A great article from @OsintCurious with tips on how to use the Internet Archive/Wayback Machine in investigations. Lots of stuff I didn't know I could do and search for: https://osintcurio.us/2021/03/03/using-archive-org-for-osint-investigations/… #OSINT
Using Archive.org for OSINT Investigations
The Internet Archive, commonly known as the Wayback Machine allows users to visit archived versions of websites. The Internet Archive has been archiving sites since 1996 and has 514 billion archive…
osintcurio.us
7:10 AM · Jun 2, 2021


https://osintcurio.us/2021/03/03/using-archive-org-for-osint-investigations/

The Internet Archive, commonly known as the Wayback Machine allows users to visit archived versions of websites. The Internet Archive has been archiving sites since 1996 and has 514 billion archived web pages!

If you are wondering how you can use the Internet Archive in your OSINT research, you’ve come to the right place. There are many methods to extract important information from the Wayback Machine to further your OSINT investigations. If you are looking to see historical versions of a website because the site was deleted or replaced with new content, the Wayback Machine can help. You may need to verify that a target previously worked at a company, but the current version of the site no longer shows the target’s information. Sometimes a target may intentionally hide information from their present website; looking at older versions of the site may reveal new information. Sometimes you can gather relevant data like names, phone numbers, email addresses, and even metadata from older versions of a website. Let’s explore search methods…

Quick Search Methods:

The quickest way to see everything archived for a particular site is to visit https://web.archive.org/web/*/www.example.com, replacing www.example.com with the site of interest. Example: https://web.archive.org/web/*/www.osinttechniques.com
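As a minimal sketch (not from the article), the URL pattern above can be built programmatically. The function name `wayback_calendar_url` is hypothetical; the only thing taken from the article is the `https://web.archive.org/web/*/` prefix shown in its example.

```python
from urllib.parse import urlsplit

# Prefix copied from the article's example URL.
WAYBACK_PREFIX = "https://web.archive.org/web/*/"


def wayback_calendar_url(site: str) -> str:
    """Return the Wayback Machine calendar-view URL for a site (hypothetical helper)."""
    site = site.strip()
    # Add "//" when no scheme is given so urlsplit treats the input as a host.
    parts = urlsplit(site if "://" in site else "//" + site)
    # Keep host, path and query; drop the scheme, as the article's example does.
    target = parts.netloc + parts.path
    if parts.query:
        target += "?" + parts.query
    return WAYBACK_PREFIX + target


print(wayback_calendar_url("http://www.osinttechniques.com"))
```

This only builds the link; it does not contact archive.org, so whether a calendar actually appears still depends on the site having been archived.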


If the site has been archived, a calendar view will appear with colour-coded dots, each colour having a different meaning. The blue dots are what you’ll want to click on, as they indicate a capture of the web page. Green indicates a redirect, orange indicates the crawler received a client error, and red means there was a server error. Navigating the timeline will display the dates on which the site was archived.
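The colour scheme can be read as a mapping from HTTP response classes to dot colours. The sketch below assumes blue corresponds to 2xx responses, green to 3xx, orange to 4xx and red to 5xx; the article only names the categories (capture, redirect, client error, server error), so the numeric ranges and the function name `dot_colour` are assumptions.

```python
def dot_colour(status: int) -> str:
    """Map an HTTP status code to the calendar dot colour (assumed ranges)."""
    if 200 <= status < 300:
        return "blue"    # successful capture of the page
    if 300 <= status < 400:
        return "green"   # redirect
    if 400 <= status < 500:
        return "orange"  # crawler received a client error
    if 500 <= status < 600:
        return "red"     # server error
    raise ValueError(f"unexpected HTTP status: {status}")


for code in (200, 301, 404, 503):
    print(code, dot_colour(code))
```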

*snip*

Tips on how to use the Internet Archive/Wayback Machine in investigations. (Original Post) Nevilledog Jun 2021 OP
Love the wayback machine! soothsayer Jun 2021 #1
I really like The Internet Archive Bristlecone Jun 2021 #2
Open Source Intelligence Nevilledog Jun 2021 #5
He's got it slightly wrong - "The Internet Archive, commonly known as the Wayback Machine" csziggy Jun 2021 #3
;-{) Goonch Jun 2021 #4
Bookmarking for a much longer perusal. Thanks! hlthe2b Jun 2021 #6
Excellent resource. Ms. Toad Jun 2021 #7
My Attorney Used It To Prove Idiot I Sued Was Lying About Dates DanieRains Jun 2021 #11
Moving Image Archive canetoad Jun 2021 #8
I saw a certain presidential candidate who ran an escort service kimbutgar Jun 2021 #9
Bookmarking! Hekate Jun 2021 #10
Also useful for US news sites that block access from Europe muriel_volestrangler Jun 2021 #12

Bristlecone

(10,133 posts)
2. I really like The Internet Archive
Wed Jun 2, 2021, 09:01 PM

So much stuff in there that I enjoy.

I wonder if it will tell me what OSINT stands for though? I’ll give it a try.

csziggy

(34,137 posts)
3. He's got it slightly wrong - "The Internet Archive, commonly known as the Wayback Machine"
Wed Jun 2, 2021, 09:06 PM

That is not correct. The Wayback Machine is only one part of the Internet Archive. I mostly use Internet Archive for locating old, out of print books. For instance, they have a marvelous selection of antique needlework books and magazines, with designs and stitch patterns. Also, many of the genealogies that were published during the family research craze of the late 1800s are to be found there.

Their statement on their main page is "Internet Archive is a non-profit library of millions of free books, movies, software, music, websites, and more."

While their immense collection is available at no charge, they would appreciate donations from anyone who finds their site useful. I donate a small amount every month since I use their resources regularly.

Dear Internet Archive Community,

Right now more than ever before, we need your help. 2020 brought unique challenges and unprecedented demand for our services. In the middle of a global pandemic, natural disasters, and political turmoil, we're all turning to our screens for information—today is the Internet's day.

With our staff working remotely and our community relying on us like never before, we’re providing resources to digital learners, entertaining quarantined citizens everywhere, and archiving history as it unfolds. As physical libraries remain closed and the world adjusts to a new normal, we’re offering millions of texts, audio files, webpages, images, and other resources for users around the world. Right now:

We’re hosting 70 petabytes of data and counting
The Wayback Machine is storing more than 475 billion webpages
Readers around the world are browsing more than 28 million books and texts
Music lovers, podcast listeners, Old Time Radio fans, and audiophiles have access to more than 14 million recordings
Users are uploading more than 17,000 items per day

The Internet Archive has always kept our collections completely free for everyone, everywhere. We don’t charge for access, sell user data, or run ads. Instead, we rely on the generosity of individuals like you to pay for servers, staff, and preservation projects.

A little goes a long way. For $20, we can acquire, digitize, and preserve a book forever. If everyone who uses the archive contributed just $5, we could continue offering these services for free and ad-free for years to come. If you find our site useful, please chip in!

More: https://archive.org/donate/?origin=iawww-TopNavDonateButton

Ms. Toad

(34,087 posts)
7. Excellent resource.
Wed Jun 2, 2021, 10:21 PM

As a patent attorney, I used archive.org to search for prior art to invalidate competitors' patents and (to a lesser extent) to see if there might be prior art that could create patenting issues.

In my current occupation - I spent about 3 hours on it yesterday compiling a complete document of all bar exam essay prompts from 1995 to present (previously I had most from 2006 to present).

This has more detailed search guidance than I've encountered before.

 

DanieRains

(4,619 posts)
11. My Attorney Used It To Prove Idiot I Sued Was Lying About Dates
Thu Jun 3, 2021, 12:00 AM

We got an instant summary judgement on the spot, and the case continued till idiot hired better counsel and settled.

Invaluable.

canetoad

(17,183 posts)
8. Moving Image Archive
Wed Jun 2, 2021, 10:29 PM

All sorts of interesting stuff in this section of the archive.

Download or listen to free movies, films, and videos
This library contains digital movies uploaded by Archive users which range from classic full-length films, to daily alternative news broadcasts, to cartoons and concerts. Many of these videos are available for free download. Check our FAQ for more information.

https://archive.org/details/movies

muriel_volestrangler

(101,361 posts)
12. Also useful for US news sites that block access from Europe
Thu Jun 3, 2021, 06:18 AM

Since the EU (including the UK, then) introduced rules about the data websites can keep about their visitors, many US news organisations, rather than do some work to respect the data privacy of their readers, have placed a blanket ban on access from anywhere in Europe.

However, the Wayback Machine works from the USA, so it has free access to the news stories. And the Wayback Machine doesn't try and keep data about you, so it can just serve you up the story without breaking the GDPR rules.
