Thursday, July 18, 2024
From the WireTechnology

5 Steps to Beautiful Line Charts in Python

The article “5 Steps to Beautiful Line Charts in Python” shows how to use Matplotlib to build clear, visually appealing line charts. Building on the author’s previous article about bar charts, it covers the specifics of line charts and gives step-by-step guidance for stronger storytelling through data visualization. The examples use a public dataset on countries’ GDP over the past 50 years. Whether you’re a beginner or an experienced Python user, the steps below can help you turn simple line charts into striking, informative graphics.

Step 1: Importing the Necessary Packages

To start building beautiful line charts in Python, you need to import the necessary packages. In this case, you will need pandas, matplotlib.pyplot, and the timedelta class from the datetime module.

Pandas is a data manipulation library that lets you work with data structures such as dataframes. matplotlib.pyplot is Matplotlib’s plotting module; it provides the tools to create many kinds of visualizations, including line charts. Lastly, timedelta is a class in Python’s standard datetime module that represents a duration, which lets you do arithmetic on dates and times. With these imports in place, you have the functions and methods needed to create your line chart.

Here is an example of how to import these packages in Python:

    import pandas as pd
    import matplotlib.pyplot as plt
    from datetime import timedelta
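None of the snippets in this article actually use timedelta after importing it. As a minimal sketch of what the class does (the dates are arbitrary illustrations, not values from the GDP dataset), it represents a duration that can be added to a date or obtained by subtracting two dates:

```python
from datetime import date, timedelta

# Arbitrary example date, chosen only for illustration
last_day_2022 = date(2022, 12, 31)

# Adding a timedelta shifts the date by that duration
next_day = last_day_2022 + timedelta(days=1)

# Subtracting two dates yields a timedelta
span = date(2022, 12, 31) - date(2022, 1, 1)

print(next_day)
print(span.days)
```

One plausible use in a chart like this one is padding the limits of a date axis by a few days or months, although the article does not show that step.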

Step 2: Reading the Data

After importing the necessary packages, the next step is to read the data that you want to plot on your line chart. In this case, you will be reading a CSV file that contains the GDP information of different countries over the past 50 years.

You can use the pandas library’s read_csv() function to read the CSV file and store the data in a dataframe. A dataframe is a two-dimensional labeled data structure with columns of potentially different types. By storing the data in a dataframe, you can easily manipulate and plot the data.

Here is an example of how to read a CSV file and store the data in a dataframe:

    # Read the data
    df = pd.read_csv('data.csv', sep=',')

    # Drop unnecessary columns
    df.drop(['Series Name', 'Series Code', 'Country Code'], axis=1, inplace=True)

In this example, the read_csv() function reads the data from a file named ‘data.csv’ using the specified separator ‘,’ (which is also the default). The drop() method then removes the unnecessary columns from the dataframe. The ‘axis=1’ parameter specifies that columns, not rows, should be dropped, and ‘inplace=True’ modifies df directly instead of returning a new dataframe.
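The article does not show the layout of data.csv. The sketch below assumes, purely for illustration, a file with a ‘Country Name’ column and one column per year; the country names and numbers are made up. It reads the text from memory instead of a file, drops the three columns named above, and reshapes the result into the ‘Year’ and ‘GDP’ columns that the later steps expect:

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv (made-up countries and values)
csv_text = """Country Name,Country Code,Series Name,Series Code,2021,2022
Aland,AAA,GDP,X1,100,110
Bland,BBB,GDP,X1,200,190
"""

df = pd.read_csv(io.StringIO(csv_text), sep=",")

# Non-mutating equivalent of drop(..., axis=1, inplace=True)
df = df.drop(columns=["Series Name", "Series Code", "Country Code"])

# Reshape from one column per year to one row per (country, year)
long_df = df.melt(id_vars="Country Name", var_name="Year", value_name="GDP")

# Column headers are read as strings, so convert the years to integers
long_df["Year"] = long_df["Year"].astype(int)

print(long_df)
```

If your file already has one row per country and year, the melt() step is unnecessary.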

Step 3: Filtering the Data

Once you have read the data and stored it in a dataframe, you can filter the data to include only the top 20 countries of 2022. This step allows you to focus on the specific data that you want to visualize on your line chart.

To filter the data, you can create a new dataframe that includes only the rows that meet the desired criteria: here, the 20 countries with the highest GDP in 2022. Pandas’ boolean indexing and its nlargest() method handle this.

Here is an example of how to filter the data to include only the top 20 countries:

    # Filter on the Top 20 countries of 2022
    top_20_countries = df[df['Year'] == 2022].nlargest(20, 'GDP')

In this example, the df[df['Year'] == 2022] expression filters the dataframe to include only the rows where the ‘Year’ column is equal to 2022. The nlargest(20, 'GDP') method then selects the 20 rows with the highest values in the ‘GDP’ column.
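The filter above returns only the 2022 rows. A line chart of GDP over time also needs the earlier years for those countries. The following sketch shows one way to keep them, assuming a ‘Country Name’ column (not confirmed by the article) and using made-up data with a top 2 instead of a top 20:

```python
import pandas as pd

# Made-up long-format data: one row per (country, year)
df = pd.DataFrame({
    "Country Name": ["A", "A", "B", "B", "C", "C"],
    "Year": [2021, 2022, 2021, 2022, 2021, 2022],
    "GDP": [50, 60, 90, 80, 10, 70],
})

# Rank countries by their 2022 value (top 2 here instead of top 20)
top_2022 = df[df["Year"] == 2022].nlargest(2, "GDP")

# Keep every year for those countries so there is a history to plot
top_names = top_2022["Country Name"]
history = df[df["Country Name"].isin(top_names)]

print(top_names.tolist())
print(history)
```

Here country A has the lowest 2022 value, so all of its rows are excluded, while B and C keep both of their years.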

Step 4: Plotting the Line Chart

Now that you have the filtered data, you can proceed to plot the line chart. A line chart is a type of chart that displays information as a series of data points connected by straight line segments.

To plot the line chart, you can use the matplotlib.pyplot module, which provides a variety of functions to customize the appearance and layout of your chart.

Here are the steps to plot the line chart:

Set the figure size

Before plotting the lines, you can set the figure size to control the dimensions of the chart. This step allows you to specify the width and height of the chart based on your preferences or the requirements of the visual.

    # Set the figure size
    plt.figure(figsize=(10, 6))

In this example, the figure(figsize=(10, 6)) statement sets the figure size to 10 inches in width and 6 inches in height.
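An alternative that is common in Matplotlib code, though not used in this article, is plt.subplots(), which returns the figure and its axes together. This minimal sketch creates a figure of the same size and reads the size back to confirm it; the Agg backend line is only there so that it runs without a display:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# plt.subplots creates the figure and its axes in one call
fig, ax = plt.subplots(figsize=(10, 6))

# The size is stored in inches as (width, height)
width, height = fig.get_size_inches()
print(width, height)
```

The size in pixels of a saved image also depends on the resolution (dpi), which can be set separately.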

Plot the lines

To plot the lines on the chart, you can use the plot() function of matplotlib.pyplot. This function takes the x-axis values and y-axis values as inputs and plots the corresponding data points connected by line segments.

    # Plot the lines
    plt.plot(top_20_countries['Year'], top_20_countries['GDP'])

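In this example, the plot(top_20_countries['Year'], top_20_countries['GDP']) statement plots the ‘Year’ column on the x-axis and the ‘GDP’ column on the y-axis. Note that top_20_countries, as filtered in Step 3, holds only the 2022 rows, so every x value is 2022. To show the evolution over time, the plotted dataframe needs the rows for all years of those 20 countries, with one plot() call per country.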
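As a hedged sketch of drawing one line per country, the code below groups a small made-up dataframe by an assumed ‘Country Name’ column and calls plot() once per group. The dataframe name history and its values are illustrative, not taken from the article:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data: all years for two countries
history = pd.DataFrame({
    "Country Name": ["A", "A", "A", "B", "B", "B"],
    "Year": [2020, 2021, 2022, 2020, 2021, 2022],
    "GDP": [40, 50, 60, 95, 90, 80],
})

fig, ax = plt.subplots(figsize=(10, 6))

# groupby yields one sub-frame per country; each becomes its own line
for name, group in history.groupby("Country Name"):
    group = group.sort_values("Year")  # keep points in chronological order
    ax.plot(group["Year"], group["GDP"], label=name)

print(len(ax.lines))
```

Sorting each group by year matters because plot() connects points in the order they appear in the data.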

Add a title and labels

To provide context and make the chart more informative, you can add a title and labels to the chart. The title describes the purpose or subject of the chart, while the labels provide names for the x-axis and y-axis.

    # Add a title and labels
    plt.title('GDP Evolution over time of the Top 20 countries')
    plt.xlabel('Year')
    plt.ylabel('GDP')

In this example, the title('GDP Evolution over time of the Top 20 countries') statement adds a title to the chart. The xlabel('Year') and ylabel('GDP') statements add labels to the x-axis and y-axis, respectively.
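Axis labels are easier to read when the tick values are formatted as well. The sketch below sets the same title and labels through the axes object and adds a tick formatter. It assumes GDP values are raw amounts in the trillions (the article does not state the unit); the helper name trillions and the plotted values are hypothetical:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def trillions(value, _position):
    """Format a raw amount as trillions with one decimal place."""
    return f"{value / 1e12:.1f}T"


fig, ax = plt.subplots(figsize=(10, 6))
ax.plot([2020, 2021, 2022], [1.0e12, 1.5e12, 2.5e12])  # made-up values

# Axes-level equivalents of plt.title, plt.xlabel and plt.ylabel
ax.set_title("GDP Evolution over time of the Top 20 countries")
ax.set_xlabel("Year")
ax.set_ylabel("GDP")

# Apply the formatter to every major tick on the y-axis
ax.yaxis.set_major_formatter(FuncFormatter(trillions))
```

Without a formatter, Matplotlib typically shows large values in scientific notation with an offset, which is harder to read at a glance.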

Customize the appearance

To enhance the appearance of the chart, you can customize various aspects such as the line color, line style, gridlines, and legends. This step allows you to make the chart visually appealing and easy to interpret.

    # Customize the appearance
    plt.grid(True)
    plt.legend(['Top 20 countries'])

In this example, the grid(True) statement adds gridlines to the chart. The legend(['Top 20 countries']) statement adds a legend to the chart, labeling the line as ‘Top 20 countries’.
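The line color and line style mentioned above are set on the plot() call itself. This minimal sketch, with made-up country names and values, shows those keyword arguments together with a lighter grid and a legend built from each line’s label:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

years = [2020, 2021, 2022]

fig, ax = plt.subplots(figsize=(10, 6))

# color, linestyle and linewidth control how each line is drawn
ax.plot(years, [40, 50, 60], color="tab:blue", linestyle="-",
        linewidth=2, label="Country A")
ax.plot(years, [95, 90, 80], color="tab:orange", linestyle="--",
        linewidth=2, label="Country B")

# Dotted, semi-transparent gridlines stay in the background
ax.grid(True, linestyle=":", alpha=0.5)

# With no arguments for the entries, legend() uses each line's label
legend = ax.legend(title="Country", frameon=False)
```

Passing label= to each plot() call keeps the legend entries tied to their lines, which is less error-prone than passing a separate list of names to legend().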

Step 5: Saving and Displaying the Chart

After creating and customizing the line chart, you have the option to save the chart as an image and/or display it on your screen.

To save the chart as an image, you can use the savefig() function of matplotlib.pyplot. This function takes the file name and infers the image format from its extension, unless you set the optional format argument explicitly.

    # Save the chart as an image
    plt.savefig('line_chart.png')

In this example, the savefig('line_chart.png') statement saves the chart as ‘line_chart.png’.
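savefig() also accepts optional arguments that affect the output quality. The sketch below is one possible setup and not the article’s own code: it saves a made-up chart at a chosen resolution, trims the surrounding whitespace, and writes into a temporary directory so that it leaves no file behind:

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot([2020, 2021, 2022], [40, 50, 60])  # made-up values

# The temporary directory is deleted when the with-block ends
with tempfile.TemporaryDirectory() as out_dir:
    png_path = os.path.join(out_dir, "line_chart.png")

    # The .png extension selects the format; dpi sets the resolution and
    # bbox_inches="tight" trims extra whitespace around the chart
    fig.savefig(png_path, dpi=150, bbox_inches="tight")
    saved_ok = os.path.getsize(png_path) > 0

# Release the figure's memory once it is no longer needed
plt.close(fig)
print(saved_ok)
```

In your own script, replace the temporary path with the location where you want to keep the image.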

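To display the chart on your screen, you can use the show() function of matplotlib.pyplot. When run from a script, this function typically opens the chart in a separate window; in a notebook, the chart is usually rendered inline. Call it after savefig(), since the figure may no longer be available once the window is closed.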

    # Display the chart
    plt.show()

In this example, the show() statement displays the chart on your screen.


Building beautiful line charts in Python can be achieved by following these five steps: importing the necessary packages, reading the data, filtering the data, plotting the line chart, and saving and displaying the chart. By utilizing the capabilities of pandas and matplotlib.pyplot, you can create visually appealing and informative line charts that effectively communicate your data.

Remember to consider the appearance and layout of your chart, including the figure size, title, labels, and customization options. These elements can greatly contribute to the overall visual impact and clarity of your line chart.

Applied together with sound design practices, these steps help your line charts tell a more interesting story.