How to build a simple template engine with Python and regex - 02/02/2023

Here is v1.0.0 of the Mathew templating engine


[Enigma Bits](https://medium.com/@birnadin?source=post_page-----ecb81d711ceb--------------------------------)


Prologue

As I mentioned previously, I want to create a static content creation system. The first step is a template engine. Rather than building a fully featured template engine, I plan to build only what is needed in this major iteration.

I have also saved some bonus features for later major iterations 🎊.

With that being said, this major iteration (namely v1.0.0) will have 2 basic features:

1. Including external templates into another (or inheritance, I guess 🤔)
2. Looping over a dataset to produce multiple pages

Before anything else, we should decide on the syntax. The generic form I have decided on looks like…

{ macro\_name macro\_parameter }
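As a rough sketch of how this generic syntax could be recognised, the snippet below uses a regex with two named groups. The group names `name` and `param`, the constant `MACRO` and the helper `find_macros` are labels I made up for illustration; they are not part of Mathew.

```python
import re

# Assumed pattern for illustration: "{", optional spaces, a macro name,
# whitespace, a dotted parameter, optional spaces, "}".
MACRO = re.compile(r"\{\s*(?P<name>\w+)\s+(?P<param>[\w.]+)\s*\}")


def find_macros(text):
    """Return (macro_name, macro_parameter) pairs found in text."""
    return [(m.group("name"), m.group("param")) for m in MACRO.finditer(text)]


macros = find_macros("<body>{ include content.main }{eval pubs.cat }</body>")
print(macros)  # [('include', 'content.main'), ('eval', 'pubs.cat')]
```

The engine built in this post uses a simpler pattern with a single group, since only the parameter is needed for now.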

Without further ado, let’s go 🏃‍♀️

1. Including external templates into another

To embed another page, index.html, into base.html, the syntax would look like this:

```html
<html>
<!-- base.html -->
  <head>…</head>
  <body>
    <!-- some generic content -->
    { include content.main }
  </body>
</html>

<!-- index.html -->
<h1>Welcome to SPC</h1>
```

So, what I want to do is read through base.html and replace the {} macro wherever one is encountered. We could do this in many different ways, but an easy one is the regex way.

regex stands for Regular Expression

Using regex with Python is much simpler than other languages make it seem. If you want me to do a quick tour of regex with Python, please let me know in the comments.

So, to substitute the template we would do something like

import re # import the standard regex library  
pattern = r'{\\s?\\w+\\s(\\w+\\.\\w+)\\s?}' # regex pattern to search for; the dot is escaped so it only matches a literal "."  
specimen = """  
<html>  
 <head>…</head>  
 <body>  
 <!-- some generic content -->  
 { include content.main }  
 </body>  
</html>  
"""  
replace = "<h1>Welcome to SPC</h1>"  
parsed\_str = re.sub(pattern, replace, specimen) # using .sub() from library

Now, if we write parsed_str to a file, we get the page we intended. Let’s encapsulate this in a function for modularity and to stay DRY. The function would be:

def eval\_include(specimen, replacement):  
 global pattern  
 return re.sub(pattern, replacement, specimen)

If you are disgusted by the global keyword, just so you know, I come from assembly language and Cheat Engine 😜, so I am pretty comfortable with it.
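For readers who would rather avoid it: Python only requires `global` when a function rebinds a module-level name, not when it merely reads one. Here is a minimal sketch of the same function without it. The constant name `INCLUDE_PATTERN` is my own choice, and passing a function as the replacement is a variation on the code above, not what v1.0.0 does.

```python
import re

# Module-level constant; reading it inside a function needs no `global`.
# The name INCLUDE_PATTERN is an assumption made for this sketch.
INCLUDE_PATTERN = re.compile(r"\{\s?\w+\s(\w+\.\w+)\s?\}")


def eval_include(specimen, replacement):
    # A callable replacement is inserted as-is, so backslashes in the
    # replacement text are not processed as escape sequences.
    return INCLUDE_PATTERN.sub(lambda _match: replacement, specimen)


page = eval_include("<body>{ include content.main }</body>", "<h1>Welcome to SPC</h1>")
print(page)  # <body><h1>Welcome to SPC</h1></body>
```

The callable matters when the embedded template contains backslashes: given a plain string, `re.sub` interprets escapes such as `\n` or `\1` in it.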

Now, an end user might use the library like…

from os.path import realpath  
from mathew.macros import eval\_include  
  
base = ""  
with open(realpath("templates/base.html"), "r") as b:  
 base = b.read()  
  
index\_template = ""  
with open(realpath("templates/index.html"), "r") as i:  
 index\_template = i.read()  
  
with open(realpath("out/index.html"), "w") as i:  
 i.write(  
   eval\_include(base, index\_template) # do the templating magic 🧙‍♂️  
 )

The parsed page can be found in the out/ directory. File discovery and everything else will be automated later. For now, let’s focus on one thing at a time.

2. Looping over a dataset to produce multiple pages

Let’s say we have a list of article titles to display on the blog’s homepage. E.g.

```html
<!-- pubslist.html -->
<section>
  <h2>Patrician Publications</h2>
  { include pubsdetail.html }
</section>

<!-- pubslistitem.html -->
<article>
  <h4>{ eval pubs.title}</h4>
  <span>{eval pubs.cat }</span>
  <p>{ eval pubs.sum }</p>
</article>
```

```python
# the dataset we would like to map
{"pubs": [
    {"title": "Some 404 content", "cat": "kavik", "sum": "Summary 501"},
    {"title": "Some 403 content", "cat": "eric", "sum": "Summary 502"},
    {"title": "Some 402 content", "cat": "beric", "sum": "Summary 503"},
    {"title": "Some 401 content", "cat": "manuk", "sum": "Summary 504"},
]}
```

The dataset maps directly to Python’s dict without any additional logic. The difference from embedding another template is that here we evaluate variables: we produce one rendered string per item by replacing the placeholders in the template with that item’s data, and then embed the resulting string into the destination template.

Let’s do it, shall we?

To evaluate the variable, we can use the groups feature of regex. That’s what the () in the pattern are for: they capture the dotted name, such as pubs.title. We can access the captured slice with the .group() method on the match object returned by the re library functions.

str\_1 = "Hello 123"  
pattern = r'\\w+\\s(\\d+)'  
digits = re.finditer(pattern, str\_1) # returns an iterator of \`match\` objects  
for digit in digits:  
 print(digit.group(1)) # 123

Notice we ask for group 1, not 0. It is not that the library is 1-indexed; it is 0-indexed, but group 0 is the entire match, “Hello 123”.
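To see the group numbering side by side, here is a small standalone check using `re.search` on the same string:

```python
import re

match = re.search(r"\w+\s(\d+)", "Hello 123")

print(match.group(0))  # Hello 123  (the entire match)
print(match.group(1))  # 123        (the first parenthesised group)
print(match.groups())  # ('123',)   (all captured groups as a tuple)
```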

Remember the .sub() method? Its second parameter accepts either a str or a callable. The callable receives a match object as its argument for each match found, so we can produce dynamic replacements based on each match, like…

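\# construct the key step-by-step from a match object  
def key\_of(m):  
 key = m.group(1) # == "pubs.title"  
 key = key.split(".") # == \["pubs", "title"\]  
 return key\[1\] # == "title"  
  
\# i is the index of the current dataset item, template is the string being parsed  
re.sub(  
 pattern,  
 lambda m: dataset\["pubs"\]\[i\]\[key\_of(m)\],  
 template,  
)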

If lambda is mysterious to you, it is a way to define an anonymous, inline function in Python.
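The following self-contained sketch shows that a lambda and a named function are interchangeable as the replacement callable. The names `lookup`, `datum` and `template` are placeholders chosen for this example.

```python
import re

pattern = r"\{\s?\w+\s(\w+\.\w+)\s?\}"
datum = {"title": "Some 404 content"}  # one item from the dataset


def lookup(m):
    # m.group(1) is e.g. "pubs.title"; keep the part after the dot
    return datum[m.group(1).split(".")[1]]


template = "<h4>{ eval pubs.title}</h4>"

with_def = re.sub(pattern, lookup, template)
with_lambda = re.sub(pattern, lambda m: datum[m.group(1).split(".")[1]], template)

print(with_def)                 # <h4>Some 404 content</h4>
print(with_def == with_lambda)  # True
```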

The functions for the library API would then be:

\# map each datumset  
def \_\_eval\_map(string, data):  
 global pattern  
 return re.sub(  
   pattern, lambda m: data\[m.group(1).split(".")\[1\]\], string  
 )  
  
\# parse the batch of dataset  
def parse\_template(template, data):  
 return \[  
   \_\_eval\_map(template, datum)  
   for datum in data  
 \]

parse_template returns the aggregated results using list comprehension syntax. If you are unfamiliar with the syntax, let me know in the comments.
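In case it helps, a list comprehension is shorthand for building a list in a loop. Below is a sketch of the same function written with an explicit loop; `eval_map` and `parse_template_loop` are names I picked for this standalone version.

```python
import re

pattern = r"\{\s?\w+\s(\w+\.\w+)\s?\}"


def eval_map(string, data):
    # same idea as __eval_map above, renamed for this standalone sketch
    return re.sub(pattern, lambda m: data[m.group(1).split(".")[1]], string)


def parse_template_loop(template, data):
    results = []
    for datum in data:
        results.append(eval_map(template, datum))
    return results


items = [{"cat": "kavik"}, {"cat": "eric"}]
print(parse_template_loop("<span>{eval pubs.cat }</span>", items))
# ['<span>kavik</span>', '<span>eric</span>']
```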

So, accessing the key to evaluate is just as breezy as…

from os.path import realpath  
from mathew.macros import parse\_template, eval\_include  
  
specimen = """  
<article>  
 <h4>{ eval pubs.title}</h4>  
 <span>{eval pubs.cat }</span>  
 <p>{ eval pubs.sum }</p>  
</article>  
"""  
dataset = {  
 "pubs": \[  
 {"title": "Some 404 content", "cat": "kavik", "sum": "Summary 501"},  
 {"title": "Some 403 content", "cat": "eric", "sum": "Summary 502"},  
 {"title": "Some 402 content", "cat": "beric", "sum": "Summary 503"},  
 {"title": "Some 401 content", "cat": "manuk", "sum": "Summary 504"},  
 \],  
 }  
  
\# parse each \`<article>\` tag for each list item  
parsed\_str = parse\_template(specimen, dataset\["pubs"\])  
  
\# join the \`<article>\` tag-group  
pubs\_list\_items = "".join(parsed\_str)  
pubs\_list\_template = ""  
with open(realpath("templates/pubslist.html"), "r") as p:  
 pubs\_list\_template = p.read()  
  
\# parse the \`pubs\_list\` itself  
parsed\_list = eval\_include(pubs\_list\_template, pubs\_list\_items)  
  
\# write the final file with base (\`base\` is read from templates/base.html, as in the first example)  
with open(realpath("out/pubs.html"), "w") as i:  
 i.write(  
   eval\_include(base, parsed\_list)  
 )

The final page, pubs.html, will be in the out/ directory.
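If you want to try the whole flow without touching the file system, here is a condensed in-memory sketch. The template strings are shortened stand-ins for the files above, and the two functions are re-declared locally (without `global`) so the snippet runs on its own.

```python
import re

pattern = r"\{\s?\w+\s(\w+\.\w+)\s?\}"


def eval_include(specimen, replacement):
    # replace every macro in specimen with the given string
    return re.sub(pattern, lambda _m: replacement, specimen)


def parse_template(template, data):
    # render the template once per datum
    return [
        re.sub(pattern, lambda m: datum[m.group(1).split(".")[1]], template)
        for datum in data
    ]


# shortened stand-ins for base.html, pubslist.html and the item template
base = "<body>{ include content.main }</body>"
pubs_list = "<section>{ include pubsdetail.html }</section>"
item = "<h4>{ eval pubs.title}</h4>"
pubs = [{"title": "Some 404 content"}, {"title": "Some 403 content"}]

items_html = "".join(parse_template(item, pubs))
page = eval_include(base, eval_include(pubs_list, items_html))
print(page)
# <body><section><h4>Some 404 content</h4><h4>Some 403 content</h4></section></body>
```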

Done?

Not quite. Did you notice that we still have to read the template strings manually, supply the data in a specific format, and run the parsing of each template by hand?

These are for later. For now, we have a simple working template engine that does the job I intended it for. I am happy with it.

Another thing keen eyes might have noticed is that the macro_name in the template does nothing. In fact, if you swap include with eval or anything else, the script still does its job as long as the latter part is valid. This is bad design, but the worst part is that our eval_include allows only one template: every macro in the specimen gets the same replacement. Gotta fix that!
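Both problems are easy to reproduce. This is a small demonstration using the pattern from this post; the macro name `banana` and the `header.main` / `footer.main` parameters are made up for the test.

```python
import re

pattern = r"\{\s?\w+\s(\w+\.\w+)\s?\}"

# The macro name is matched by \w+ but never inspected
as_include = re.sub(pattern, "X", "{ include content.main }")
as_eval = re.sub(pattern, "X", "{ eval content.main }")
as_typo = re.sub(pattern, "X", "{ banana content.main }")
print(as_include, as_eval, as_typo)  # X X X

# Every macro in a specimen also receives the same replacement
both = re.sub(pattern, "X", "{ include header.main } and { include footer.main }")
print(both)  # X and X
```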

Epilogue

If you are intrigued and interested, make sure you follow me on Medium or Twitter for the follow-up.

I guess I don’t have anything further, so I will just sign off. This is BE signing off.

Cover by Suzy Hazelwood