Learning to Use the JSON Data for Research

We've developed some resources to help you get started with the JSON data in our repository. The first is a series of posts on the Verizon Security Blog that shows how to pull the data into MongoDB and build a detailed report on a specific industry. By the end of the series, you will have a script that can be easily modified to look at the industry of your choice. The repository also contains a similar script without the industry filter, so you can generate an overall report covering every incident in the dataset.

Healthcare Data Breaches: Using VCDB and Mongo to Find Answers

Part 1:

The VERIS Community Database (VCDB) project was launched last year with the goal of making a freely downloadable, raw dataset of publicly disclosed data breach incidents available for research. With the Q2 release, the dataset has grown to over 3,500 incidents, each in its own JSON file. This article will walk the beginner through the process of loading the data into MongoDB and running queries on one industry: Healthcare. (read full article)
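As a rough illustration of the loading and filtering step described above, the following Python sketch reads each incident file from a directory and picks out Healthcare victims. It is not the script from the blog series. It assumes that each incident records the victim's NAICS code as a string in `victim.industry`, that `victim` may be a single object or a list of objects, and that Healthcare corresponds to NAICS codes beginning with 62. The function names are hypothetical, and `insert_into_mongo` expects a pymongo collection supplied by the caller.

```python
import json
from pathlib import Path


def load_incidents(directory):
    """Read every *.json file in `directory` into a list of dicts."""
    incidents = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            incidents.append(json.load(handle))
    return incidents


def victim_industries(incident):
    """Return the NAICS codes recorded for an incident's victim(s).

    Assumption: `victim` is either one object or a list of objects,
    each with an optional `industry` string.
    """
    victims = incident.get("victim", [])
    if isinstance(victims, dict):
        victims = [victims]
    return [str(v.get("industry", "")) for v in victims if isinstance(v, dict)]


def is_healthcare(incident):
    """True if any victim's NAICS code starts with 62 (assumed Healthcare)."""
    return any(code.startswith("62") for code in victim_industries(incident))


def insert_into_mongo(incidents, collection):
    """Insert the incidents into a pymongo collection supplied by the caller."""
    if incidents:  # insert_many rejects an empty list
        collection.insert_many(incidents)
```

Once the incidents are in MongoDB, the same filter can be expressed as a query such as `collection.find({"victim.industry": {"$regex": "^62"}})`, again assuming the field layout described above.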

Part 2:

Back in July, we looked at working with the VERIS Community Database (VCDB) data to see some basic information about security incidents in the Healthcare industry. Since that time, we’ve completed another update to the dataset, so there are more incidents for us to explore. To begin, you’ll want to drop the existing data from your database and import the new dataset to ensure you have no duplicates. I didn’t cover dropping a database in the prior article, so let’s go over that now. (read full article)
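The drop-and-reimport step can be sketched in Python with pymongo as follows. This is a minimal sketch and not the procedure from the article, which may use the mongo shell instead. The function name and the database and collection names passed to it are hypothetical, and `client` is assumed to be a `pymongo.MongoClient` created by the caller.

```python
def reload_database(client, db_name, collection_name, incidents):
    """Drop `db_name`, re-insert `incidents`, and return the document count.

    Dropping first means a re-import cannot leave duplicate incidents behind.
    """
    client.drop_database(db_name)
    collection = client[db_name][collection_name]
    if incidents:  # insert_many rejects an empty list
        collection.insert_many(incidents)
    return collection.count_documents({})
```

Dropping the whole database removes every collection in it, so this only makes sense when the database holds nothing but the VCDB import.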

Part 3:

In this final installment of the series, we will extend our script to provide even more detail on the healthcare-sector incidents in the VERIS Community Database (VCDB). If you missed Part One or Part Two, we highly recommend reviewing them first. They walk you through building the first and second scripts, which this article expands upon. Unless you are already familiar with the MongoDB Aggregation Framework, you should begin with those earlier parts. (read full article)
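For readers new to the Aggregation Framework, here is a small example of the kind of pipeline involved. It is a sketch and not the script developed in the series. It counts Healthcare incidents per year, assuming the NAICS code is stored in `victim.industry` (with 62 as the Healthcare prefix) and the incident year in `timeline.incident.year`. The function names are hypothetical, and `collection` is assumed to be a pymongo collection supplied by the caller.

```python
def healthcare_by_year_pipeline(naics_prefix="62"):
    """Build a pipeline that counts matching incidents per incident year."""
    return [
        # Keep only incidents whose victim NAICS code starts with the prefix.
        {"$match": {"victim.industry": {"$regex": "^" + naics_prefix}}},
        # Group the remaining incidents by year and count them.
        {"$group": {"_id": "$timeline.incident.year", "incidents": {"$sum": 1}}},
        # Return the years in ascending order.
        {"$sort": {"_id": 1}},
    ]


def healthcare_by_year(collection):
    """Run the pipeline and return the per-year counts as a list of dicts."""
    return list(collection.aggregate(healthcare_by_year_pipeline()))
```

Passing a different `naics_prefix` gives the same per-year breakdown for another industry, which mirrors the idea of adapting the report to the industry of your choice.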