From JSON to Dashboard: Visualizing DuckDB Queries in Streamlit with Plotly

Picture by Editor | ChatGPT

# Introduction

Information is an organization’s most vital useful resource, and insights from information might make the distinction between revenue and failure. Nevertheless, uncooked information is tough to grasp, so we visualize it in dashboards so non-technical individuals can higher navigate it.

Constructing a dashboard will not be simple, particularly when working with JSON information. Fortunately, many Python libraries may be mixed to create a useful device.

On this article, we are going to discover ways to develop a dashboard utilizing Streamlit and Plotly to visualise DuckDB queries on information from a JSON file.

Curious? Let’s get into it.

# Dashboard Growth

Earlier than growing our dashboard, let’s be taught a bit concerning the instruments we are going to use.

First, JSON, or JavaScript Object Notation, is a text-based format for storing and transmitting information utilizing key-value pairs and arrays. It’s a generally used format for APIs and information interchange between programs.

Subsequent, DuckDB is an open-source RDBMS (Relational Database Administration System) designed for analytical workloads. It’s an in-process on-line analytical processing (OLAP) SQL database that runs instantly within the Python course of with out the necessity to handle a separate server. It’s additionally optimized for quick execution, ultimate for information evaluation with giant datasets.

Streamlit is usually used for dashboard growth. It’s an open-source framework for growing interactive information internet purposes utilizing Python. To develop the dashboard, we don’t want to grasp HTML, CSS, or JavaScript.

We can even use pandas, a strong library for information manipulation and evaluation in Python.

Lastly, Plotly is an open-source library for growing interactive graphs and charts. It may be built-in with dashboard growth libraries corresponding to Streamlit.

That’s the essential rationalization of the instruments we are going to use. Let’s begin growing our JSON Dashboard. We are going to use the next construction, so attempt to create it as follows.

JSON_Dashboard/
├── information/
│   └── pattern.json
├── app.py
└── necessities.txt

Subsequent, let’s fill the recordsdata with all of the required data. First, let’s have our JSON instance information just like the one under. You possibly can all the time use your individual information, however right here is an instance you should utilize.

[
  {"id": 1, "category": "Electronics", "region": "North", "sales": 100, "profit": 23.5, "date": "2024-01-15"},
  {"id": 2, "category": "Furniture", "region": "South", "sales": 150, "profit": 45.0, "date": "2024-01-18"},
  {"id": 3, "category": "Electronics", "region": "East", "sales": 70, "profit": 12.3, "date": "2024-01-20"},
  {"id": 4, "category": "Clothing", "region": "West", "sales": 220, "profit": 67.8, "date": "2024-01-25"},
  {"id": 5, "category": "Furniture", "region": "North", "sales": 130, "profit": 38.0, "date": "2024-02-01"},
  {"id": 6, "category": "Clothing", "region": "South", "sales": 180, "profit": 55.2, "date": "2024-02-05"},
  {"id": 7, "category": "Electronics", "region": "West", "sales": 90, "profit": 19.8, "date": "2024-02-10"},
  {"id": 8, "category": "Furniture", "region": "East", "sales": 160, "profit": 47.1, "date": "2024-02-12"},
  {"id": 9, "category": "Clothing", "region": "North", "sales": 200, "profit": 62.5, "date": "2024-02-15"},
  {"id": 10, "category": "Electronics", "region": "South", "sales": 110, "profit": 30.0, "date": "2024-02-20"}
]

Subsequent, we are going to fill the necessities.txt file with the libraries we are going to use for our dashboard growth.

streamlit
duckdb
pandas
plotly

Then, run the next code to put in the required libraries. It’s endorsed to make use of a digital setting when organising the setting.

pip set up -r necessities.txt

As soon as every little thing is prepared, we are going to develop our dashboard. We are going to discover the applying code step-by-step so you’ll be able to observe the logic.

Let’s begin by importing the mandatory libraries for our dashboard.

import streamlit as st
import duckdb
import pandas as pd
import plotly.specific as px

Subsequent, we are going to arrange the connection we have to DuckDB.

@st.cache_resource
def get_conn():
    return duckdb.join()

The code above will cache the DuckDB connection so the Streamlit dashboard doesn’t have to reconnect when the dashboard reruns, which avoids any efficiency lag.

Then, we put together the code to learn the JSON information utilizing the next code.

@st.cache_data
def load_data(path):
    df = pd.read_json(path, convert_dates=["date"])
    return df

Within the code above, we rework the JSON file right into a pandas DataFrame and cache the info so we don’t have to learn it once more when the filter modifications.

After the info loading and connection are prepared, we are going to connect with DuckDB to retailer the JSON information. You possibly can all the time change the info location and desk title.

conn = get_conn()
df_full = load_data("information/pattern.json")
conn.execute("CREATE OR REPLACE TABLE gross sales AS SELECT * FROM df_full")

Within the code above, we register the DataFrame as an SQL desk named gross sales inside DuckDB. The desk might be refreshed from reminiscence on each rerun, as we aren’t organising persistence in a separate script.

That’s all for the backend; let’s put together the Streamlit dashboard. First, let’s put together the dashboard title and the sidebar filters.

st.title("From JSON to Dashboard: DuckDB SQL Visualizer")

st.sidebar.header("Filter Choices")
class = st.sidebar.multiselect("Choose Class:", df_full['category'].distinctive())
area = st.sidebar.multiselect("Choose Area:", df_full['region'].distinctive())
date_range = st.sidebar.date_input("Choose Date Vary:", [df_full['date'].min(), df_full['date'].max()])

The sidebar above will turn out to be a dynamic filter for the loaded information, the place we are able to change the SQL question based mostly on these filters.

We then construct the SQL question based on the filters with the next code.

question = "SELECT * FROM gross sales WHERE TRUE"
if class:
    question += f" AND class IN {tuple(class)}"
if area:
    question += f" AND area IN {tuple(area)}"
question += f" AND date BETWEEN '{date_range[0]}' AND '{date_range[1]}'"

The question above is constructed dynamically based mostly on the consumer’s choice. We begin with a WHERE TRUE situation to simplify appending further filters with AND.

With the question era prepared, we are going to present the question and the ensuing information with the next code.

st.subheader("Generated SQL Question")
st.code(question, language="sql")

df = conn.execute(question).df()
st.subheader(f"Question Outcomes: {len(df)} rows")
st.dataframe(df)

The code above exhibits the SQL question used to retrieve information from DuckDB and converts the end result right into a pandas DataFrame to show the filtered desk.

Lastly, we are going to put together the Plotly visualizations utilizing the filtered information.

if not df.empty:
    col1, col2 = st.columns(2)

    with col1:
        st.markdown("### Scatter Plot: Gross sales vs Revenue by Area")
        scatter_fig = px.scatter(df, x="gross sales", y="revenue", coloration="area", hover_data=["category", "date"])
        st.plotly_chart(scatter_fig, use_container_width=True)

    with col2:
        st.markdown("### Bar Chart: Complete Gross sales by Class")
        bar_fig = px.bar(df.groupby("class", as_index=False)["sales"].sum(), x="class", y="gross sales", text_auto=True)
        st.plotly_chart(bar_fig, use_container_width=True)

    st.markdown("### Line Chart: Day by day Gross sales Pattern")
    line_fig = px.line(df.groupby("date", as_index=False)["sales"].sum(), x="date", y="gross sales")
    st.plotly_chart(line_fig, use_container_width=True)
else:
    st.warning("No information discovered for the chosen filters.")

Within the code above, we create three completely different plots: a scatter plot, a bar chart, and a line chart. You possibly can all the time change the chart sort based on your wants.

With all of the code prepared, we are going to run the next command to launch our Streamlit dashboard.

Now you can entry the dashboard, which seems just like the picture under.

The plots will appear like the picture under.

Because the visualizations use Plotly, you’ll be able to navigate them interactively, as proven within the line chart under.

That’s all you must know. You possibly can all the time add extra complexity to the dashboard and even deploy it in your enterprise.

# Conclusion

Information is probably the most invaluable useful resource an organization can have, and visualizing it in a dashboard is a means for enterprise individuals to realize insights. On this article, we realized the way to develop a easy dashboard with Streamlit and Plotly whereas connecting to information from a JSON file saved in DuckDB.

I hope this has helped!

Cornellius Yudha Wijaya is an information science assistant supervisor and information author. Whereas working full-time at Allianz Indonesia, he likes to share Python and information suggestions by way of social media and writing media. Cornellius writes on quite a lot of AI and machine studying matters.

Main Menu

What's Hot

Rent Gifted Offshore Copywriters In The Philippines

5 Highly effective Python Decorators for Excessive-Efficiency Information Pipelines

U.S. Holds Off on New AI Chip Export Guidelines in Shock Transfer in Tech Export Wars

From JSON to Dashboard: Visualizing DuckDB Queries in Streamlit with Plotly

5 Highly effective Python Decorators for Excessive-Efficiency Information Pipelines

What OpenClaw Reveals In regards to the Subsequent Part of AI Brokers – O’Reilly

mAceReason-Math: A Dataset of Excessive-High quality Multilingual Math Issues Prepared For RLVR

Evaluating the Finest AI Video Mills for Social Media

Utilizing AI To Repair The Innovation Drawback: The Three Step Resolution

Midjourney V7: Quicker, smarter, extra reasonable

Meta resumes AI coaching utilizing EU person knowledge

Rent Gifted Offshore Copywriters In The Philippines

5 Highly effective Python Decorators for Excessive-Efficiency Information Pipelines

U.S. Holds Off on New AI Chip Export Guidelines in Shock Transfer in Tech Export Wars

When You Ought to Not Deploy Brokers

Main Menu

Subscribe to Updates

What's Hot

From JSON to Dashboard: Visualizing DuckDB Queries in Streamlit with Plotly

# Introduction

# Dashboard Growth

# Conclusion

Related Posts