GuidesBuild an applicationFree Org Chart Builder

Free Org Chart Builder

A guide by
BN
Bik Nandy
CJ
Chase Johnson

Introduction #

In this guide, we'll be quickly transforming CSV data into insightful HR chart visualizations using the beginner-friendly Python Streamlit.

If you'd like to jump right in, you can fork the completed version of this app and run it right away here here:

This app has numerous applications, including visualizing sales and financial data, marketing campaign performance, survey results, healthcare patient information, and much more. In this example, we'll use it to analyze an HR employee report to uncover valuable insights.

Feel free to familiarize yourself with the Streamlit docs before we proceed.

While following along in the code if you ever need further explanation we recommend talking to Replit AI Chat in the workspace.

Setup Streamlit Template #

To begin, fork this template by clicking "Use Template" below:

Download the following CSV

users.csv

Upload this to your Replit files. You should see it under main.py.

Using the Dependencies tab ensure you have the following packages installed.

python

1altair 2folium 3geopy 4pandas 5streamlit 6streamlit-agraph 7streamlit-folium
Dependencies

You can also edit these in your pyproject.toml file. The changes in that file will sync with your dependencies.

Read and View CSV #

The first step is to read the CSV file and display the raw data to the Webview tab. Streamlit makes it very easy to do this:

python

1import streamlit as st 2import pandas as pd 3import altair as alt 4from streamlit_agraph import agraph, Node, Edge, Config 5from streamlit_folium import st_folium 6import folium 7from geopy.geocoders import Nominatim 8from datetime import datetime 9 10# File path of the CSV file 11csv_file_path = "users.csv" 12 13# Read the CSV file into a DataFrame 14data = pd.read_csv(csv_file_path) 15 16st.header("Raw Data") 17st.write(data)

This code uses the popular pandas package (say that 3 times fast) to read the CSV into a DataFrame. We then use Streamlit (in our code, st) to display the raw data. Give it a shot!

Raw Data

You should see the data shown above. Take a look at the different fields included here: Employee,Title,Location,Employee Type,Yearly Salary,Level,Department,Team,Manager,Start Date,End Date .

Before we proceed, we want to setup different pages for all the charts we’ll be taking a look at today. To do that we can simply add a sidebar element with radio buttons:

python

1# Create a sidebar navigation menu 2page = st.sidebar.radio("Select a page", ( 3 "Raw Data", 4 "Level Distribution", 5 "Total Spend", 6 "Org Chart", 7 "Location Map", 8 "Add Employee", 9))

And then right under this:

python

1if page == "Raw Data": 2 # Display the raw data 3 st.header("Raw Data") 4 st.write(data) 5 6elif page == "Level Distribution": 7 st.header("Level Distribution") 8 9elif page == "Total Spend": 10 st.header("Total Spend") 11 12elif page == 'Add Employee': 13 st.header("Add Employee") 14 15elif page == "Org Chart": 16 st.header("Org Chart") 17 18elif page == 'Location Map': 19 st.header("Employee Map")

You should now see a sidebar with pages and headings next to the arrow on the top left:

Side Bar

These are the following sections we’ll cover, all the related code will be constrained to each elif block.

Data Section #

We already have most of this section completed as you can see while viewing the Raw Data tab. Let’s try some calculations to familiarize ourselves with the data.

First let’s calculate the average salary of all employees:

python

1# ... All other code 2if page == "Raw Data": 3 # Calculate the average salary 4 data['Yearly Salary'] = data['Yearly Salary'].astype(float) 5 average_salary = data['Yearly Salary'].mean() 6 7 # Display in a human-readable format 8 st.subheader(f"Average Salary: ${average_salary:,.2f}")

Here, we’re making sure all salaries are typed as a float and then calculating the average. We then display the average using the st.subheader - which is essentially just text. The f before the string signifies that the string is a Python F-string or a “formatted literal string” to allow for variables within. We are also formatting the salary into a money-like format.

Next, let’s try calculating the average tenure:

python

1# ... All other code 2if page == "Raw Data": 3 # ... Salary Code 4 # Calculate the tenure for each employee 5 data['Start Date'] = pd.to_datetime(data['Start Date']) 6 data['End Date'] = pd.to_datetime(data['End Date']) 7 data['Tenure'] = (data['End Date'].fillna(datetime.now()) - 8 data['Start Date']).dt.days / 365.25 9 10 average_tenure = data['Tenure'].mean() 11 st.subheader(f"Average Tenure: {average_tenure:.2f} years")

Here we calculate the tenure by:

  1. Converting the 'Start Date' and 'End Date' columns to datetime format
  2. Calculating the difference between 'End Date' and 'Start Date' in days
  3. Converting the days to years by dividing by 365.25 (accounting for leap years)
  4. Calculating the mean of all tenure values to get the average

Pie Chart - Company Level Distribution #

Next, we’ll deal with our first chart: a basic pie chart showing the distribution of company levels in the organization.

Level Distribution

python

1# ... All other code 2elif page == "Level Distribution": 3 # Calculate counts and percentages 4 level_counts = data['Level'].value_counts().reset_index() 5 level_counts.columns = ['Level', 'Count'] 6 level_counts['Percentage'] = level_counts['Count'] / level_counts['Count'].sum() * 100 7 8 # Create a pie chart with labels for levels and counts 9 chart = alt.Chart(level_counts).mark_arc().encode( 10 theta=alt.Theta(field="Count", type="quantitative", stack=True), 11 color=alt.Color(field="Level", type="nominal", legend=None), 12 tooltip=['Level', 'Count', alt.Tooltip('Percentage', format='.2f')] 13 ).properties(width=400, height=400, title='Level Distribution') 14 15 # Add text for counts and levels 16 text = chart.mark_text(radiusOffset=20, size=16).encode( 17 text=alt.Text('Count:Q', format='.0f'), 18 radius=alt.Radius('Count:Q', scale=alt.Scale(type='sqrt', zero=True, rangeMin=30)), 19 color=alt.value('black') 20 ) 21 22 # Display the chart in Streamlit 23 st.altair_chart(alt.layer(chart, text))

The first step is crucial, focusing on manipulating the level data for all employees.

Here we retrieve the count of each unique level in the Level column of the data DataFrame using data['Level'].value_counts(). We then reset the index of the resulting Series, creating a new DataFrame level_counts with columns for Level and Count. After this, we simply calculate the total percentage of levels.

Next, we setup the pie chart using the altair to create a pie chart using the counts data.

The mark_arc method is used to create the pie slices, and the encode method is used to map the data to the visual properties of the chart:

  • theta is used to map the Count field to the size of the pie slices
  • color is used to map the Level field to the color of the pie slices
  • tooltip is used to display the Percentage values when the user hovers over the slice

After this, we add the text for the counts and levels. The code adds text labels to the pie chart using the mark_text function. The radius encoding is used to position the text labels at a fixed distance from the center of the pie chart, using a square root scale to adjust the spacing.

Finally, we display the chart with altair_chart using 2 layers, 1 for the actual pie and 1 for the text.

Bar Charts - Total Spend (Salaries) #

Next, we’ll go over bar charts to visualize total salary spend across different dimensions of the company. Bar charts are great for comparing quantities across different categories. Let's start with a chart showing total salaries by team:

Total Spend

The code below is all you need to create a bar chart in Streamlit:

python

1elif page == "Total Spend": 2 # Calculate the total salaries by team 3 salary_by_team = data.groupby('Team')['Yearly Salary'].sum().reset_index() 4 5 # Create a bar chart 6 st.subheader("Total Salaries by Team") 7 by_team_chart = alt.Chart(salary_by_team).mark_bar().encode( 8 x='Team', 9 y=alt.Y('Yearly Salary', axis=alt.Axis(format='$,.2f', title='Total Salary')), 10 tooltip=['Team', alt.Tooltip('Yearly Salary', format='$,.2f')], 11 color='Team' 12 ) 13 st.altair_chart(by_team_chart, use_container_width=True)

Here we’re first grouping the data by Team and then aggregated the sum of all the employees yearly summaries. Recall from the bar chart that reset_index() is used to create a custom salary_by_team DataFrame with Team and Yearly Salary columns. The next part is similar to the bar chart except we format the x and y axises to money.

Now, try creating a chart that shows salaries grouped by department, similar to the team-based chart we just made!

Forms - Add Employee #

Before diving into more advanced charts, let's create a Streamlit form for adding a new employee.

First, let’s build the actual form UI:

python

1elif page == 'Add Employee': 2 st.header("Add Employee") 3 4 # Define the dropdown values as arrays 5 employee_types = ["Full-Time", "Part-Time", "Contractor", "Intern"] 6 levels = ["Executive", "L4", "L3", "L2", "L1"] 7 departments = ["Engineering", "Product", "Sales", "HR", "Executive"] 8 teams = [ 9 "Backend", "Frontend", "Strategy", "Execution", "Enterprise", "SMB", 10 "Recruiting", "Operations", "Executive" 11 ] 12 managers = [ 13 "Cynthia Banks", "Andrew Vasquez", "Jordan Robinson", "Lisa Smith", 14 "Dustin Johnston", "Stephanie Bush", "Virginia Brown", 15 "Christine Coleman", "Erin Morrow", "Sean Lynch", "Heather Thompson", 16 "Cheryl Berg", "Debra Caldwell", "" 17 ] 18 19 # Create a form for adding a new user 20 with st.form(key='add_user_form'): 21 employee_name = st.text_input('Employee Name') 22 title = st.text_input('Title') 23 location = st.text_input('Location') 24 employee_type = st.selectbox('Employee Type', employee_types) 25 yearly_salary = st.number_input('Yearly Salary', 26 min_value=0.0, 27 step=0.01, 28 format="%.2f") 29 level = st.selectbox('Level', levels) 30 department = st.selectbox('Department', departments) 31 team = st.selectbox('Team', teams) 32 manager = st.selectbox('Manager', managers) 33 start_date = st.date_input('Start Date') 34 end_date = st.date_input('End Date (optional)', 35 value=None, 36 min_value=start_date) 37 38 # Submit button 39 submit_button = st.form_submit_button(label='Add Employee')

Here we define several constant arrays to be used as dropdown options for the employee type, level, department, team, and manager. Then we do the following to build the form:

  • Inside the st.form() function, we define several input fields using Streamlit functions:
    • st.text_input() for employee name, title, and location.
    • st.selectbox() for employee type, level, department, team, and manager.
    • st.number_input() for yearly salary.
    • st.date_input() for start date and end date (optional).

The st.form_submit_button() function is used to create a submit button for the form.

However, you'll notice that the button doesn't actually perform any action yet. We need to connect it to the CSV file so it can write the new employee data. That’s where this next part comes in (note, put this outside of the with form statement from before):

python

1if submit_button: 2 # Prepare the new user data as a dictionary 3 new_user = { 4 'Employee': [employee_name], 5 'Title': [title], 6 'Location': [location], 7 'Employee Type': [employee_type], 8 'Yearly Salary': [yearly_salary], 9 'Level': [level], 10 'Department': [department], 11 'Team': [team], 12 'Manager': [manager], 13 'Start Date': [start_date.strftime('%Y-%m-%d')], 14 'End Date': [end_date.strftime('%Y-%m-%d') if end_date else ''] 15 } 16 17 # Create a DataFrame from the new user data 18 new_user_df = pd.DataFrame(new_user) 19 20 # Concatenate the new user DataFrame with the existing DataFrame 21 updated_data = pd.concat([data, new_user_df], ignore_index=True) 22 23 # Save the updated DataFrame to the CSV file 24 updated_data.to_csv(csv_file_path, index=False) 25 26 st.success('User added successfully!')

The if statement is checking if the submit_button variable is True, which indicates that the user has clicked the Add Employee button to submit the form. We then create a dictionary new_user that contains the values entered in the form for the new employee. The keys of the dictionary correspond to the column names in the data, and the values are stored as lists (even though there is only one new employee). After that, we create a new Pandas DataFrame new_user_df using the new_user dictionary, concatenate it with the existing data and write to the file. Finally we give the user a little success confirmation message.

Give it a shot! You should now see the user added to the bottom row in the CSV.

Graphs - Company Org Chart #

Back to chart, let’s go over a slightly more complex kind of chart: a graph. A graph is essentially a collection of nodes and edges connecting them. In this case, we'll use a graph to create an organizational chart for our company. This will visually represent the hierarchical structure of the organization, showing the relationships between managers and their direct reports.

Org Chart

Let's take a look at the code to create this org chart:

python

1# ... Previous code here 2elif page == "Org Chart": 3 st.header("Org Chart") 4 5 # Create nodes for each employee 6 nodes = [] 7 for _, row in data.iterrows(): 8 node = Node( 9 id=row['Employee'], 10 label=row['Employee'], 11 title=row['Title'], 12 labelStyle={'fill': 'white'}, 13 size=25, 14 shape='circularImage', 15 image= 16 f"https://ui-avatars.com/api/?name={row['Employee']}&background=random" 17 ) 18 nodes.append(node) 19 20 # Create edges to represent the reporting structure 21 edges = [] 22 for _, row in data.iterrows(): 23 if pd.notna(row['Manager']): 24 edge = Edge(source=row['Manager'], 25 target=row['Employee'], 26 type="STRAIGHT") 27 edges.append(edge) 28 29 # Configure the graph layout 30 config = Config(width=950, 31 height=950, 32 directed=True, 33 nodeHighlightBehavior=True, 34 physics=False, 35 highlightColor="#F7A7A6", 36 collapsible=True, 37 node={ 38 'labelProperty': 'label', 39 'titleProperty': 'title' 40 }, 41 link={ 42 'labelProperty': 'label', 43 'renderLabel': True 44 }, 45 levelSeparation=300, 46 hierarchical=True, 47 rankDir='TB') 48 49 # Create the Agraph visualization 50 agraph(nodes=nodes, edges=edges, config=config)

In this code, we create nodes for each employee in the data:

  • We iterate through each row in the data DataFrame.
  • For each row, we create a Node object with the employee's name, title, and a randomly generated avatar image.
  • We append each Node object to the nodes list.

Next, we create edges to represent the reporting structure:

  • We iterate through the data DataFrame again.
  • For each row with a non-empty 'Manager' column, we create an Edge object. The manager's name serves as the source, and the employee's name as the target.
  • We append each Edge object to the edges list.

Finally, we configure the graph with certain stylistic choices. Try messing around with the settings to see what changes! agraph here is from the user-made Streamlit component streamlit_agraph that helps us visualize graphs very easily. Learn more about Streamlit components here.

Maps - Employee Locations #

The final type of visualization we'll cover is mapping employee locations.

Employee Map

The locations of users right now are only saved as strings, ie “San Francisco” or “Houston”. In order to find the latitude and longitude here, we’ll need to do the following:

python

1# Initialize the geolocator and a dictionary for caching coordinates 2geolocator = Nominatim(user_agent="my-app") 3coordinates_dict = {} 4 5def get_coordinates(location): 6 if location not in coordinates_dict: 7 location_obj = geolocator.geocode(location) 8 if location_obj: 9 coordinates_dict[location] = (location_obj.latitude, 10 location_obj.longitude) 11 return coordinates_dict.get(location)

This function will gives us back the coordinates if it’s not already in the coordinates_dict dictionary. The reason this dictionary exists is to cache the results of geocoding operations. Geocoding can be a slow and rate-limited process, so caching the results helps improve performance by avoiding unnecessary repeated lookups for the same location. This is especially useful when dealing with multiple employees in the same city or region.

Now, let’s check out the full map chart code:

python

1elif page == 'Location Map': 2 st.header("Employee Map") 3 4 # Initialize the geolocator and a dictionary for caching coordinates 5 geolocator = Nominatim(user_agent="my-app") 6 coordinates_dict = {} 7 8 def get_coordinates(location): 9 if location not in coordinates_dict: 10 location_obj = geolocator.geocode(location) 11 if location_obj: 12 coordinates_dict[location] = (location_obj.latitude, 13 location_obj.longitude) 14 return coordinates_dict.get(location) 15 16 # Create a Folium map 17 first_location = data['Location'].iloc[0] 18 map_center = get_coordinates(first_location) 19 m = folium.Map(location=map_center if map_center else [0, 0], zoom_start=4) 20 21 # Add circle markers for each employee 22 for _, row in data.iterrows(): 23 coordinates = get_coordinates(row['Location']) 24 if coordinates: 25 folium.CircleMarker( 26 location=coordinates, 27 radius=3, 28 popup=f"Employee: {row['Employee']}<br>Location: {row['Location']}", 29 color='blue', 30 fill=True, 31 fill_color='blue', 32 fill_opacity=0.7).add_to(m) 33 34 # Display the map using Streamlit 35 st_folium(m, width=700, height=500)

Create a Folium map:

  • Get the location of the first employee from the data DataFrame and use it as the map's initial center.
  • Create a Folium map object m with this initial center and a zoom level of 4.

Add circle markers for each employee:

  • Iterate through each row in the data DataFrame.
  • For each row, we obtain the coordinates for the employee's location using the get_coordinates function.
  • If coordinates are found, we add a Folium CircleMarker to the map, specifying the location, radius, popup (showing employee name and location), color, and opacity.

Display the map in Streamlit:

  • We use the st_folium() function from the Streamlit Folium component intergration to display the map in the Streamlit app, setting the width to 700 pixels and height to 500 pixels.

Deploying Your App #

In order to keep your app running and allow others to visit it 24/7, you'll need to deploy it on a hosted server.

Open a new tab in the Workspace and search for “Deployments” or open the control console by typing ⌘ + K (or Ctrl + K) and type "deploy". You should find a screen like this.

Replit Deployments selection UI

For apps like this that need always need to be up listening to requests, we recommend using a Reserved VM. On the next screen, click Approve and configure build settings most internal bots work fine with the default machine settings but if you need more power later, you can always come back and change these settings later. You can monitor your usage and billing at any time at: replit.com/usage.

On the next screen, you’ll be able to set your primary domain and edit the Secrets that will be in your production deployment. Usually, we keep these settings as they are.

Finally, click Deploy and watch your app go live!

What's Next #

You now know how to use Python and Streamlit to visualize data from a CSV and deploy it as a hosted web app for others to see!

If you’d like to bring this project and similar templates into your team, set some time here with the Replit team for a quick demo of Replit Teams.

Happy coding!