How to Generate ERDs from a Snowflake Model

Styled ERD for the Chinooks Database with Graphviz

Only a few days ago I presented the proper way to reverse-engineer a Snowflake database schema with a simple Python script, without going at all through INFORMATION_SCHEMA:

  • SHOW TABLES — to get all the tables
  • SHOW COLUMNS — to get all the columns for these tables
  • SHOW PRIMARY KEYS — for all PRIMARY KEY constraints
  • SHOW UNIQUE KEYS — for all UNIQUE constraints
  • SHOW IMPORTED KEYS — for all FOREIGN KEY constraints

Here is another simple Python script that generates ERDs (Entity-Relationship Diagrams) from the loaded metadata in DOT Graphviz notation. The generated code can be pasted on free online GraphViz viewers like viz-js, to instantly render ERDs. Or you may get them previewed automatically in VSCode, through a great plugin. The full open-source code is now on GitHub, free for you to use!

The Entity-Relationship Diagrams

For a free Chinook sample database, we’ll generate three DOT files, as follows:

(a) The -full.dot version will be used to render the table columns with their data types (which displays the most amount of information):

Full ERD Model

(b) The -columns.dot version will be used to render the table columns with no data types (which shows the structure of each table, without going into too much detail):

Simple Table Columns ERD Model

(c) The -relationships.dot version will be used to render just the table names and the relationships between (which is great for complex models, with many tables and relationships):

Relationship-Based ERD Model

The Python Script Generator

To generate these files, our Python script will use the Table instances loaded already into the “tables” dictionary, and call the following sequence:

dumpDotERD(tables, f"samples/{database}.{schema}-relationships.dot")
dumpDotERD(tables, f"samples/{database}.{schema}-full.dot")
dumpDotERD(tables, f"samples/{database}.{schema}-columns.dot")

Each dumpDotERD call will generate a “digraph” with nodes and relationships, as you can already see in the default demo at viz-js. Remark that you can use different colours, styles, and pen sizes from built-in themes, but we will keep it simple here.

def dumpDotERD(tables, filename):
isCollapsed = filename.endswith("-relationships.dot")
isFull = filename.endswith("-full.dot")
s = ('# copy and paste all this to http://viz-js.com/\n\n'
+ 'digraph G {\n'
+ ' graph [ rankdir="LR" bgcolor="#ffffff" ]\n'
+ f' node [ style="filled" shape="Mrecord" gradientangle="180" ]\n'
+ ' edge [ arrowhead="none" arrowtail="none" dir="both" ]\n\n')
# generate shape nodes and links
for name in tables:
s += tables[name].getDotShape(isFull, isCollapsed)
s += "\n"
for name in tables:
s += tables[name].getDotLinks()
s += "}\n"
print(s)
# save all in a DOT file
with open(filename, "w") as file:
file.write(s)

The Table class has been extended with the getDotShape method, to render the tables as shapes:

def getDotShape(self, isFull, isCollapsed):
fillcolor = "#f5f5f5" if not isCollapsed else "#e0e0e0"
colspan = "2" if isFull else "1"
s = (f' {self.label} [\n'
+ f' fillcolor="{fillcolor}" color="#6c6c6c" penwidth="1"\n'
+ f' label=<<table style="rounded" border="0" cellborder="0" cellspacing="0" cellpadding="1">\n'
+ f' <tr><td bgcolor="#e0e0e0" align="center" colspan="{colspan}"><font color="#000000"><b>{self.name}</b></font></td></tr>\n')
if isFull or not isCollapsed:
for column in self.columns:
name = column.name
if column.ispk: name = f"<u>{name}</u>"
if column.fkof != None: name = f"<i>{name}</i>"
if column.nullable: name = f"{name}*"
if column.identity: name = f"{name} I"
if column.isunique: name = f"{name} U"
if isFull:
s += (f' <tr><td align="left"><font color="#000000">{name}&nbsp;</font></td>\n'
+ f' <td align="left"><font color="#000000">{column.datatype}</font></td></tr>\n')
else:
s += f' <tr><td align="left"><font color="#000000">{name}</font></td></tr>\n'
return s + ' </table>>\n ]\n'

The same Table class has been extended with the getDotLinks method, to show the relationships:

def getDotLinks(self):
s = ""
for constraint in self.fks:
fks = self.fks[constraint]
fk1 = fks[0]
dashed = "" if not fk1.nullable else ' style="dashed"'
arrow = "" if fk1.ispk and len(self.pks) == len(fk1.fkof.table.pks) else ' arrowtail="crow"'
s += f' {self.label} -> {fk1.fkof.table.label} [ penwidth="1" color="#696969"{dashed}{arrow} ]\n'
return s

The Graphviz Preview Plugin

The DOT code was generated both on-screen and into separate files. You may copy and paste the generated content into a free Graphviz-based online viewer, as said before.

But you may also get the SVG images directly into your VSCode IDE, using the Graphviz Preview VSCode plugin. Once you select a .dot file and click on the Preview toolbar icon, the file is instantly rendered as SVG in a separate tab:

Graphviz Preview plugin in VSCode

Conclusions

If you’re a Data Engineer, you may find it very practical to have one simple Python script that reverse-engineers a Snowflake database schema, and another simple Python script (or the same, extended) that generates DOT files from this loaded metadata.

The DOT files help render ER diagrams, with shapes and links, for the selected tables and their relationships. The layout is automatic, because of the power of Graphviz, an old and free extension to render SVG images.

You may copy and paste the generated content into a free online Graphviz-based viewer. Or you can get a DOT file visualizer plugin directly into your VSCode IDE, once you select a DOT file.

We lately generate directly HTML files, that you can open instantly in your browser. This was made possible by using the D3 Graphviz renderer.

--

--

Cristian Scutaru
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

Multi-Cloud Data Expert (in AWS/Azure/GCP). Former Snowflake ”Data Superhero” and Certification SME.