
ehrnst

(32,640 posts)
Tue Feb 14, 2017, 10:30 AM

Diehard Coders Just Rescued NASA's Earth Science Data




ON SATURDAY MORNING, the white stone buildings on UC Berkeley’s campus radiated with unfiltered sunshine. The sky was blue, the campanile was chiming. But instead of enjoying the beautiful day, 200 adults had willingly sardined themselves into a fluorescent-lit room in the bowels of Doe Library to rescue federal climate data.

Like similar groups across the country—in more than 20 cities—they believe that the Trump administration might want to disappear this data down a memory hole. So these hackers, scientists, and students are collecting it to save outside government servers.

But now they’re going even further. Groups like DataRefuge and the Environmental Data and Governance Initiative, which organized the Berkeley hackathon to collect data from NASA’s earth sciences programs and the Department of Energy, are doing more than archiving. Diehard coders are building robust systems to monitor ongoing changes to government websites. And they’re keeping track of what’s already been removed—because yes, the pruning has already begun.
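The monitoring idea mentioned above can be illustrated with a minimal sketch. This is not the tooling DataRefuge or EDGI actually built; it only assumes that a monitor stores a content hash per URL and compares it against a later fetch. The function names `fingerprint` and `detect_changes` are hypothetical, and fetching the pages is left out.

```python
import hashlib


def fingerprint(page_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the raw page content."""
    return hashlib.sha256(page_bytes).hexdigest()


def detect_changes(previous: dict, current_pages: dict) -> dict:
    """Compare stored fingerprints with freshly fetched page bodies.

    previous:      {url: hex digest} recorded on an earlier run
    current_pages: {url: raw bytes} fetched on this run
    Returns sorted URL lists under "changed", "removed" and "added".
    """
    report = {"changed": [], "removed": [], "added": []}
    for url, old_hash in previous.items():
        if url not in current_pages:
            # The page was present before and is missing now.
            report["removed"].append(url)
        elif fingerprint(current_pages[url]) != old_hash:
            report["changed"].append(url)
    for url in current_pages:
        if url not in previous:
            report["added"].append(url)
    for urls in report.values():
        urls.sort()
    return report
```

A byte-level hash flags any difference, including trivial ones such as a changed timestamp in a footer, so a real monitor would likely normalize pages before hashing.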

Tag It, Bag It
The data collection is methodical, mostly. About half the group immediately sets web crawlers on easily-copied government pages, sending their text to the Internet Archive, a digital library made up of hundreds of billions of snapshots of webpages. They tag more data-intensive projects—pages with lots of links, databases, and interactive graphics—for the other group. Called “baggers,” these coders write custom scripts to scrape complicated data sets from the sprawling, patched-together federal websites.
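The split between crawlable pages and pages handed to the "baggers" could look roughly like the sketch below. The triage rules here (file extensions, query strings, a link-count threshold) are assumptions for illustration, not the criteria the volunteers used, and `needs_bagging`, `triage`, and `save_request_url` are hypothetical names. The only external detail relied on is the Internet Archive's public "Save Page Now" address pattern, `https://web.archive.org/save/` followed by the target URL; the sketch builds that address but does not send any request.

```python
from urllib.parse import urlparse

# Internet Archive "Save Page Now" prefix; the target URL is appended to it.
SAVE_ENDPOINT = "https://web.archive.org/save/"

# Assumed markers of data-heavy resources that a plain crawler copies poorly.
DATA_EXTENSIONS = (".csv", ".zip", ".nc", ".hdf", ".json", ".xml")


def needs_bagging(url: str, link_count: int = 0, max_links: int = 200) -> bool:
    """Guess whether a page needs a custom scraping script (hypothetical rule)."""
    parsed = urlparse(url)
    is_data_file = parsed.path.lower().endswith(DATA_EXTENSIONS)
    is_dynamic = bool(parsed.query)  # query strings often mean a database front end
    return is_data_file or is_dynamic or link_count > max_links


def triage(pages):
    """Split (url, link_count) pairs into crawl and bag lists, keeping order."""
    crawl, bag = [], []
    for url, link_count in pages:
        (bag if needs_bagging(url, link_count) else crawl).append(url)
    return crawl, bag


def save_request_url(url: str) -> str:
    """Build the address a crawler would request to ask for a snapshot."""
    return SAVE_ENDPOINT + url
```

In practice the tagging was done by people looking at each page, so a rule set like this would at most pre-sort a list for human review.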

It’s not easy. “All these systems were written piecemeal over the course of 30 years. There’s no coherent philosophy to providing data on these websites,” says Daniel Roesler, chief technology officer at UtilityAPI and one of the volunteer guides for the Berkeley bagger group.


https://www.wired.com/2017/02/diehard-coders-just-saved-nasas-earth-science-data/
Diehard Coders Just Rescued NASA's Earth Science Data (Original Post) ehrnst Feb 2017 OP
Today's heroes. Glad there are so many dedicated folks. JudyM Feb 2017 #1
huge k & R JHan Feb 2017 #2
That's great, but... Ligyron Feb 2017 #3
Yup, there is a lot of important stuff hidden under the top layer. lagomorph777 Feb 2017 #4

Ligyron

(7,633 posts)
3. That's great, but...
Wed Feb 15, 2017, 11:08 AM

We need these guys to help with our election(s) theft and updating the DNC too.

Never understood why the Russians and RNC were so good at hacking us when all the smart people are on our side.

lagomorph777

(30,613 posts)
4. Yup, there is a lot of important stuff hidden under the top layer.
Wed Feb 15, 2017, 11:10 AM

And let's vacuum the RNC servers while we're at it.
