File handling matters in every programming language, whether you are storing data or preserving its integrity, and whether you are dealing with simple text files or structured formats such as CSV, JSON, or binary files. These formats are used everywhere, from small applications to enterprise-level programs. In this article, I will cover the basics of the JSON format, how Python reads and writes JSON with its json module, and some techniques for improving performance when processing JSON files.
Overview of JSON file format
Before getting into the technical details, let's first understand the format itself. JSON stands for JavaScript Object Notation. It is a text format used for storing data and transferring it between client and server. It represents data as key-value pairs separated by commas. It also supports nested data structures, because a value can itself be an object or an array.
{
    "name": "Chris",
    "age": 23,
    "address": {
        "city": "New York",
        "country": "America"
    },
    "friends": {
        "name": "Emily",
        "hobbies": [ "biking", "music", "gaming" ]
    }
}
In this example, we have a root JSON object representing a person. It has simple properties such as name and age. The address property is a nested object with two properties, city and country. The person object also has a friends property, whose value is another object with a name and an array of hobbies: biking, music, and gaming. Overall, this data shows the simple key-value pair structure of the JSON format.
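Looking ahead to Python's json module (introduced later in this article), the short sketch below parses the person object above and reports which Python type each value becomes. The variable names person_json and value_types are only illustrative.

```python
import json

# The person document from the example above, held in a string.
person_json = """
{
    "name": "Chris",
    "age": 23,
    "address": {"city": "New York", "country": "America"},
    "friends": {"name": "Emily", "hobbies": ["biking", "music", "gaming"]}
}
"""

person = json.loads(person_json)

# JSON objects become dicts, arrays become lists,
# strings become str, and whole numbers become int.
value_types = {key: type(value).__name__ for key, value in person.items()}
print(value_types)
print(type(person["friends"]["hobbies"]).__name__)
```

The nested address and friends objects arrive as ordinary dictionaries, and the hobbies array arrives as a list.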
Importance and Common Use Cases
JSON is a lightweight data format used to exchange data between client and server. It is simple and human-readable, and machines can parse it easily. Its most common use is in web applications, where data is stored and passed between different web-based systems. For example, when a user fills out a Contact Us form, the information can be sent to the server as JSON for further processing.
A few other use cases of JSON files are:
- Configuration files: Many developers use JSON to store application settings in a configuration file. These files are easy to work with because all the settings and parameters of the application live in one place. The main benefit is that you don't have to change the code for each environment; you only change the configuration.
- Web development: JSON is a good fit for sending and retrieving data between client and server. In web development, JavaScript makes applications dynamic, and JSON is derived from JavaScript's object syntax, so JavaScript can easily convert JSON text into a JavaScript object for processing.
- Web APIs and cross-platform use: JSON is also widely used in web APIs because of its simplicity. These APIs exchange data between the web server and the client. JSON is also platform-independent, so programs written in different programming languages can exchange data with it.
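To illustrate the configuration-file use case, here is a minimal sketch that keeps the settings for two environments in one JSON file and picks one at run time. The setting names, the APP_ENV environment variable, and the db.example.com host are all assumptions made up for this example.

```python
import json
import os
import tempfile

# Hypothetical settings for two environments; the keys are illustrative only.
settings = {
    "development": {"debug": True, "db_host": "localhost"},
    "production": {"debug": False, "db_host": "db.example.com"},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    # A temporary file stands in for a real config.json here.
    config_path = os.path.join(tmp_dir, "config.json")
    with open(config_path, "w", encoding="utf-8") as config_file:
        json.dump(settings, config_file, indent=2)

    # APP_ENV is an assumed variable name, not a standard one.
    env_name = os.environ.get("APP_ENV", "development")
    with open(config_path, "r", encoding="utf-8") as config_file:
        all_settings = json.load(config_file)

    # Fall back to the development settings for unknown environment names.
    active = all_settings.get(env_name, all_settings["development"])

print(active["db_host"])
```

Switching environments then means changing APP_ENV or the file, not the code.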
Introduction to the 'json' module in Python
Python provides a built-in module named json for working with JSON data. Because it is part of the standard library, you don't have to install any package separately; you simply import json and use its functions to read, write, and manipulate JSON.
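As a quick orientation, the sketch below touches the four functions this article relies on: dumps() and loads() work with strings, while dump() and load() work with file objects. An in-memory io.StringIO buffer is used here as a stand-in for a real file, and the record dictionary is made up for the example.

```python
import io
import json

record = {"id": 1, "tags": ["a", "b"], "active": True, "note": None}

# String-based pair: Python object -> JSON string -> Python object.
text = json.dumps(record)
from_text = json.loads(text)

# File-based pair: the same round trip through a file-like object.
buffer = io.StringIO()
json.dump(record, buffer)
buffer.seek(0)  # rewind so the data can be read back
from_file = json.load(buffer)

print(text)
print(from_text == from_file == record)
```

Note how True and None are written as true and null in the JSON text and converted back when the text is parsed.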
Working with JSON File
Sooner or later, most programs have to deal with data in JSON format, so you need to know how to read and write JSON files. Prior knowledge of file handling in Python comes in handy here. Let's start by importing the json module into the program.
1. Reading JSON File: using 'json.load()' function
Using the json.load() function, you can read JSON data that is already stored in a file and convert it into a Python object.
import json
file_path = "sample3.json"  # location of the file, here in the same directory
with open(file_path, 'r') as file:  # open the file in read mode
    content = json.load(file)  # parse the file's JSON content and store it in the content variable
print(content)
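In practice the file may be missing or may not contain valid JSON. The following is a minimal sketch of a reader that handles both cases. The helper name read_json and its default parameter are assumptions for illustration, not part of the json module.

```python
import json

def read_json(path, default=None):
    """Return the parsed content of a JSON file, or `default` if it cannot be read."""
    try:
        # Opening with an explicit encoding avoids platform-dependent defaults.
        with open(path, "r", encoding="utf-8") as file:
            return json.load(file)
    except FileNotFoundError:
        print(f"File not found: {path}")
    except json.JSONDecodeError as error:
        print(f"Invalid JSON in {path}: {error}")
    return default

# The file name below is deliberately one that should not exist.
content = read_json("does_not_exist_12345.json", default={})
print(content)
```

Returning a default keeps the calling code simple, although for required files it may be better to let the exception propagate.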
2. Parsing JSON data with error handling
Using the json.loads() function, you can parse a JSON string into a Python object. It is used when you already have the data as a string, for example from an API response or from user input. Wrap the call in error handling, because json.loads() raises json.JSONDecodeError if the string is not valid JSON.
import json
json_data = '{"fruit": "Orange", "size": "Medium", "color": "Orange"}'  # JSON data held in a string
try:
    content = json.loads(json_data)  # parse the JSON string into a Python dictionary
    print(content)
except json.JSONDecodeError as e:  # raised when the string is not valid JSON
    print("Decoding Error:", e)
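The exception object carries details about where parsing failed, which is useful when reporting errors. Here is a small sketch that uses the msg, lineno, and colno attributes of json.JSONDecodeError. The helper parse_or_report is a hypothetical name chosen for this example.

```python
import json

def parse_or_report(text):
    """Parse a JSON string; return (value, None) on success or (None, message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as error:
        # msg, lineno and colno describe what went wrong and where.
        return None, f"{error.msg} at line {error.lineno}, column {error.colno}"
    except TypeError:
        # json.loads() accepts only str, bytes, or bytearray.
        return None, "input must be str, bytes, or bytearray"

good, good_error = parse_or_report('{"fruit": "Orange"}')
bad, bad_error = parse_or_report("{'fruit': 'Orange'}")  # single quotes are not valid JSON
not_text, type_error = parse_or_report(12345)

print(good)
print(bad_error)
print(type_error)
```

Input that is not text at all raises TypeError rather than JSONDecodeError, so the sketch handles that case separately.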
3. Writing JSON files: Using 'json.dump()' function
Apart from reading, you can also write JSON files. The json.dump() function serializes a Python object and writes the resulting JSON to an open file.
import json
file_path = "sample3.json"  # location of the file, here in the same directory
data = {
    "fruit": "Apple",
    "size": "Large",
    "color": "Red"
}
with open(file_path, 'w') as file:  # open the file in write mode (existing content is overwritten)
    json.dump(data, file)  # serialize data as JSON and write it to the file
# read the file back to check what was written
with open(file_path, 'r') as file:  # open the file in read mode
    print(file.read())  # print what we wrote to the file
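json.dump() also accepts optional keyword arguments that control how the file looks. The sketch below uses indent, sort_keys, and ensure_ascii to produce a readable file. The data and the temporary file name are made up for the example.

```python
import json
import os
import tempfile

data = {"fruit": "Açaí", "size": "Small", "color": "Purple"}

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "fruit.json")

    # indent pretty-prints, sort_keys orders the keys alphabetically,
    # and ensure_ascii=False writes non-ASCII characters as they are.
    with open(file_path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=4, sort_keys=True, ensure_ascii=False)

    with open(file_path, "r", encoding="utf-8") as file:
        written = file.read()

print(written)
```

Pretty-printed files are larger, so for data that only machines read, the compact default is often the better choice.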
4. Serializing Python data structures to JSON
Using the json.dumps() function, you can serialize Python data structures such as dictionaries, lists, tuples, strings, numbers, booleans, and None into a JSON string that you can store or transmit between client and server. Its one required argument is the Python object to serialize; optional keyword arguments control the formatting. Other objects, such as instances of custom classes, are not supported by default. Unlike json.dump(), it does not write to a file; it returns the JSON string.
import json
data = {
    "fruit": "Mango",
    "size": "Large",
    "color": "Yellow"
}
encode_Data = json.dumps(data)  # serialize the dictionary into a JSON string and store it in the variable
print(encode_Data)  # print the resulting JSON string
Note that json.dumps() returns a Python str, not encoded bytes. By default (ensure_ascii=True), any non-ASCII characters are escaped, so the output contains only ASCII characters.
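For values the json module cannot serialize on its own, json.dumps() accepts a default callable that is invoked for each unsupported object. The following is a minimal sketch. The function name encode_extra and the choice of handling dates and sets are assumptions for illustration.

```python
import json
from datetime import date

def encode_extra(value):
    """Fallback for types json does not know; called only for unsupported objects."""
    if isinstance(value, date):
        return value.isoformat()  # dates become ISO 8601 strings
    if isinstance(value, set):
        return sorted(value)  # sets become sorted lists
    raise TypeError(f"Cannot serialize {type(value).__name__}")

order = {
    "item": "Mango",
    "dimensions": (10, 12),  # tuples are supported and become JSON arrays
    "tags": {"ripe", "fresh"},
    "ordered_on": date(2024, 1, 15),
}

encoded = json.dumps(order, default=encode_extra)
print(encoded)
```

The conversion is one-way: loading this string back gives lists and a plain string, not a tuple, set, or date.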
Handling Nested JSON Files
JSON files store data as text whose syntax closely resembles a dictionary literal in Python. However, remember that it is not the same thing as a dictionary: JSON is only text until it is parsed. We often receive nested JSON data, where one object is contained in another, and it needs a little more care. In Python, once nested JSON has been parsed, we navigate or iterate through the resulting structure of dictionaries and lists.
1. Understanding nested JSON data
To understand the nested JSON data, let's take an example where a JSON object is stored in another object having the following structure:
{
    "name": "Chris",
    "age": 23,
    "address": {
        "city": "New York",
        "country": "America"
    },
    "friends": [
        {
            "name": "Emily",
            "hobbies": [ "biking", "music", "gaming" ]
        },
        {
            "name": "John",
            "hobbies": [ "soccer", "gaming" ]
        }
    ]
}
In the above data, you can see that the JSON object has a list named friends, where each friend is a nested object with a name and an array of hobbies.
2. Navigating and Accessing Nested JSON Objects
To navigate and access nested objects, first parse the JSON string into a Python object with json.loads() (or use json.load() if the data is in a file). After that, you can access the data by using keys for dictionaries and indices for lists.
import json
data = '''{
    "name": "Chris",
    "age": 23,
    "address": {
        "city": "New York",
        "country": "America"
    },
    "friends": [
        {
            "name": "Emily",
            "hobbies": [ "biking", "music", "gaming" ]
        },
        {
            "name": "John",
            "hobbies": [ "soccer", "gaming" ]
        }
    ]
}'''
parsed_data = json.loads(data)  # parse the JSON string and store the result in the variable
# accessing the root keys
name = parsed_data["name"]
age = parsed_data["age"]
# accessing the nested keys
city = parsed_data["address"]["city"]
country = parsed_data["address"]["country"]
# accessing the nested data within the array
friend1 = parsed_data["friends"][0]["name"]
friend1_hobby = parsed_data["friends"][0]["hobbies"]
print("Name is:", name)
print("Age is:", age)
print("Country:", country)
print("City:", city)
print("Friend name:", friend1)
print("Friend hobby:", friend1_hobby)
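Hard-coded indices only work when you know the shape of the data in advance. As a small sketch of more defensive navigation, the example below loops over every friend, uses dict.get() for a key that may be missing, and filters the list. The "zip" key is a hypothetical field that is deliberately absent from the data.

```python
import json

data = '''{
    "name": "Chris",
    "address": {"city": "New York", "country": "America"},
    "friends": [
        {"name": "Emily", "hobbies": ["biking", "music", "gaming"]},
        {"name": "John", "hobbies": ["soccer", "gaming"]}
    ]
}'''
person = json.loads(data)

# Iterate over the list instead of indexing each friend by position.
for friend in person["friends"]:
    hobbies = ", ".join(friend["hobbies"])
    print(f'{friend["name"]}: {hobbies}')

# .get() returns a default instead of raising KeyError for a missing key.
zip_code = person.get("address", {}).get("zip", "unknown")

# Collect the names of all friends who list gaming as a hobby.
gamers = [friend["name"] for friend in person["friends"] if "gaming" in friend["hobbies"]]

print(zip_code)
print(gamers)
```

Chaining .get() with an empty dictionary as the default keeps the lookup safe even when an entire nested object is missing.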
3. Techniques for flattening JSON data
Flattening is a technique for turning the nested structure of JSON data into a tabular format, which makes the data easier to analyze and manipulate. The third-party pandas library, which has to be installed separately, provides the function json_normalize() for flattening nested JSON data into a DataFrame.
import json
import pandas as pd  # import the pandas library
data = '''[
    {
        "name": "Chris",
        "age": 23,
        "address": {
            "city": "New York",
            "country": "America"
        }
    },
    {
        "name": "Emily",
        "age": 25,
        "address": {
            "city": "New York",
            "country": "America"
        }
    },
    {
        "name": "John",
        "age": 27,
        "address": {
            "city": "New York",
            "country": "America"
        }
    }
]'''  # nested data
parsed_data = json.loads(data)  # first, parse the JSON string into a Python object
flat_data = pd.json_normalize(parsed_data)  # then, flatten the data into tabular format
print(flat_data)  # print the table
Remember that not all Python data types have a direct JSON equivalent. For example, tuples are converted into JSON arrays, so they come back as lists when the data is loaded again.
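To make clearer what flattening does with nested objects, here is a rough pure-Python sketch of the idea. It is not how pandas implements json_normalize(); the flatten helper is a hypothetical function written for this example, and it joins nested keys with a dot, which matches the column names json_normalize() produces by default.

```python
def flatten(record, parent_key="", sep="."):
    """Return a one-level dict whose keys are the dotted paths of the nested keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars and lists are kept as they are.
            flat[full_key] = value
    return flat

people = [
    {"name": "Chris", "age": 23, "address": {"city": "New York", "country": "America"}},
    {"name": "Emily", "age": 25, "address": {"city": "New York", "country": "America"}},
]

rows = [flatten(person) for person in people]
print(rows[0])
```

Each flattened dictionary corresponds to one row of the table, with keys such as address.city acting as the column names.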
Performance Optimization and Best Practices
When dealing with JSON files, several performance considerations help keep a program robust and maintainable. Here, I am going to discuss some performance and optimization techniques, along with best practices for writing clean and efficient code.
Performance Considerations for working with large JSON
- Memory constraints: Large JSON files take up a lot of memory when they are loaded into RAM. So, it's better to check the size of the file and the available resources before processing it.
- Streaming technique: If you are dealing with large JSON files, prefer processing them in chunks. Calling json.load() reads and parses the entire file at once, whereas a streaming approach handles one piece of the data at a time.
- Incremental approach: The third-party ijson library lets you parse a large JSON file iteratively. It is usually better to use such a library than to write your own chunking logic.
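The streaming idea can be sketched with the standard library alone, under one assumption about the data: that it is stored in the JSON Lines layout, with one complete JSON object per line. The generator name iter_json_lines and the sample records are made up for the example, and an in-memory io.StringIO stands in for a large file.

```python
import io
import json

def iter_json_lines(lines):
    """Yield one parsed object per non-empty line, without holding them all in memory."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)

# A small in-memory stand-in for a large file opened with open(...).
sample = io.StringIO(
    '{"id": 1, "size": 10}\n'
    '{"id": 2, "size": 25}\n'
    '\n'
    '{"id": 3, "size": 7}\n'
)

# Only one record is held in memory at a time while summing.
total_size = sum(record["size"] for record in iter_json_lines(sample))
print(total_size)
```

This only works when each line is a complete JSON document; a single large array or object needs an incremental parser instead.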
Techniques for optimizing JSON file operation
- Efficient serialization: When parsing is a bottleneck, consider third-party libraries such as cjson or ujson, which are designed to serialize and deserialize JSON faster than the standard json module. They have to be installed separately.
- Streaming APIs and generator functions: When working on a large JSON file, try to use a streaming API or a generator function that reads JSON objects one at a time (for example, line by line when each line holds one object) instead of loading them all into memory at once.
- Caching: If you access the same JSON file frequently, cache the parsed data so that the file is not loaded and parsed again on every request.
- Batch processing: If you are working with many files, try to process them in batches instead of one item at a time. This can reduce the overhead of input-output operations.
Q: Do dump() and dumps() work the same way?
A: No. The dump() function serializes a Python object to JSON and writes it to a file, whereas dumps() performs the same serialization but returns the result as a string instead of writing it to a file.
Q: What is the ijson library?
A: The ijson library is a third-party library for parsing JSON iteratively. It is useful for JSON files that are too large to load into memory at once.
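As a rough sketch of how ijson is typically used, the example below iterates over the elements of a top-level JSON array one at a time with ijson.items(). Since ijson is a separate install, the sketch falls back to the standard json module when it is not available; the fallback loads everything at once and is there only so the example runs anywhere. The helper name iter_array_items is hypothetical.

```python
import io
import json

try:
    import ijson  # third-party package, installed separately (pip install ijson)
except ImportError:
    ijson = None

def iter_array_items(binary_file):
    """Yield the elements of a top-level JSON array one at a time."""
    if ijson is not None:
        # The "item" prefix selects each element of the top-level array.
        yield from ijson.items(binary_file, "item")
    else:
        # Fallback without ijson: parses the whole document in one go.
        yield from json.load(binary_file)

# An in-memory stand-in for a large file opened in binary mode.
payload = io.BytesIO(b'[{"name": "Chris"}, {"name": "Emily"}, {"name": "John"}]')
names = [item["name"] for item in iter_array_items(payload)]
print(names)
```

With ijson installed, only the element currently being processed needs to be in memory, which is what makes it suitable for very large files.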
Wrapping Up
In conclusion, you saw that JSON is a platform-independent format used for exchanging data between client and server. Python provides powerful built-in tools and libraries for dealing with JSON files. Whether you are developing a web application or working on a web API, understanding the fundamentals of JSON is important. It is a versatile format that can be used from virtually any programming language, so mastering the methods covered in this article will help you deal with JSON files efficiently. I hope this guide is helpful for you. Happy Coding!