The Neural Networks of Pez.AI

Intro to data structures for Excel users

In this series of posts, we teach programming concepts from the perspective of spreadsheets using pez, Zato Novo’s data analysis language. If you know Excel, then you already have the foundation to start coding!

Data structures form the backbone of any programming language (and software system), and they can send a shiver down the spine of many a computer science student. But data structures don’t have to be intimidating. By the end of this post, you’ll be able to work with them confidently and efficiently.

So what is a data structure? Simply put, a data structure is a container that holds data. A spreadsheet is actually a massive data structure that represents data as a grid. Spreadsheets are good for displaying all the gory details of a (tabular) dataset but are cumbersome when moving data around or creating custom functions to modify data. Programming languages, on the other hand, provide compact notation for working with data structures, but they make it harder to see all of the data at once.

Most programming languages come with “batteries included”, meaning that once a language is installed you have everything you need to start playing with it immediately. What’s implied is that all sorts of data structures are provided out of the box, which is great for variety but makes them difficult to pick up and remember. Pez likes to err on the side of simplicity, so there are two primary data structures: lists and data frames. We’ll explore both of these structures using an example of creating financial projections for a startup.

Forecasting MRR

To make the lessons concrete, we’ll use a business forecasting example. In a previous article I showed how to use Panoptez to calculate the MRR of Slack using a basic set of assumptions. For this article, we’ll forecast the MRR of my startup, Zato Novo, based on an even simpler set of assumptions. As with the previous article, we establish a baseline approach using a Google Sheets document. This spreadsheet has a handful of columns, starting with the forecast date, followed by a projected number of paying customers. For pedagogical purposes, I’m assuming a fixed subscriber growth rate of 5% per month, which annualizes to 80%. Then I take that user number and multiply it by the base monthly price of $25/user to get a monthly recurring revenue number. To keep things simple, I’m ignoring tiers, annual prepay, and churn. This spreadsheet will be examined throughout the article as we walk through various concepts.
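
As a quick check on that annualized figure, here is a minimal Python sketch (Python, not pez) that compounds the 5% monthly rate over twelve months. The variable names are my own.

```python
# Compound a 5% monthly growth rate over 12 months.
monthly_rate = 0.05
annual_rate = (1 + monthly_rate) ** 12 - 1

# Close to the 80% quoted above.
print(f"{annual_rate:.1%}")
```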


Working with lists

Okay, now let’s see how to construct the same thing in pez. A list is an ordered collection of items and can contain any type of data. In a spreadsheet, a range of cells is analogous to a list. When we say a list is ordered, we mean that its items are guaranteed to stay in the order in which you entered them. This is like a spreadsheet where the value in A4 always follows the one in A3. In our revenue forecast example, each column is a list. It’s fine to treat each row as a list as well, although later we’ll see why it’s more convenient to think of lists as columns.

Let’s look at the first column that contains dates. In a spreadsheet we create this column by starting with an initial date. Next we define a formula that adds one month to create the next date (using EDATE in Google Sheets). We then copy and paste this formula for each successive cell to create the whole range. Our final date range lives in the cells A2:A25.


Notice that for each successive date, we are adding one month to the previous date. Hence, the second date adds 1 to the initial date, while the fourth date adds 3, and so on. In pez, we take advantage of this observation to create the dates more compactly. First, we create the initial date, which is simply the literal text 2016-01-01. If you enter dates with this specific format, pez knows that it’s a date, just like in a spreadsheet. (The same is true of timestamps.)

Now let’s create an integer range that represents how many months need to be added to the initial date to create the complete date range. For this we use the range operator, which is written as two dots (..). For example, 0..23 creates 24 integers, from 0 to 23. The final step is to create the dates, which simply requires adding the initial date to this list of numbers.

See how much simpler this is than copying and pasting a formula into a number of cells? In the spreadsheet, there is one other detail, which is that the column has a header. In pez, we just assign this expression to a variable, which we’ll call month. Here is what it looks like in our Panoptez-enabled Slack.
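
If you want to try the same idea outside of Slack, the construction can be sketched in Python. This is an analogue, not pez syntax: the add_months helper is a hypothetical function I wrote for illustration, and it assumes the start date falls on the first of the month.

```python
from datetime import date

def add_months(start: date, k: int) -> date:
    """Return the date k months after start (start is assumed to be day 1)."""
    total = start.month - 1 + k
    return date(start.year + total // 12, total % 12 + 1, start.day)

start = date(2016, 1, 1)
offsets = range(0, 24)  # plays the role of 0..23
month = [add_months(start, k) for k in offsets]

print(month[0], month[-1])
```

The list comprehension does the work of the copied-down spreadsheet formula: one expression applied to every offset.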


Literal list creation

We saw how easy integer ranges can be created in the previous section. What if you want to create a list that is not an integer range? In this case, a literal list can be created using bracket notation: [x1, x2, x3, ..., xn]. With this syntax, each element is specified explicitly within square brackets. Using the date range above, the first four elements can be created as [2016-01-01, 2016-02-01, 2016-03-01, 2016-04-01]. This approach is perfectly legal, but for efficiency, it’s often easier to think about using an expression to generate the appropriate range for you.


Element selection

So what can we do with this list? In a spreadsheet we can pull specific elements from a range and reference them in a separate cell using their coordinates. For example, January of 2017 is located at A14. This approach is convenient, but what happens if we move this column somewhere else? Let’s say we add one column to the left of A. Most of the time the spreadsheet automatically updates the cell references to reflect the column’s new location. However, that means if we need to reference it anew, we need to know where it is in the spreadsheet! For complicated spreadsheets it can start to feel like a perverse Where’s Waldo exercise. Wouldn’t it be nice if we could always reference the range using the same name and positions? In pez, our date range is called month, so any time we access month[13] we get the first day of 2017. That means no more missing references!

The operation using the name of the variable followed by brackets, x[y], is called indexing or subsetting. The number inside the brackets is called the index. In pez, the first element starts at an index of 1, while the last element is at length(x). There are other ways to index a list, but for now we’ll stick to the basics.
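
One thing to watch if you come from (or move to) another language: many languages count from 0 rather than 1. The Python sketch below illustrates the difference. The pez_index helper is hypothetical and only mimics the 1-based convention described above.

```python
from datetime import date

# First-of-month dates from 2016-01 through 2017-12, as in the forecast.
month = [date(2016 + k // 12, k % 12 + 1, 1) for k in range(24)]

def pez_index(xs, i):
    """Hypothetical helper: look up xs with a 1-based index."""
    if not 1 <= i <= len(xs):
        raise IndexError(f"index {i} is outside 1..{len(xs)}")
    return xs[i - 1]

jan_2017 = pez_index(month, 13)      # the same element as Python's month[12]
last = pez_index(month, len(month))  # the last element sits at index length(x)
print(jan_2017, last)
```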

Compounding growth

Let’s move on to the second column, which models hypothetical user growth. Starting with an initial value of 100 users (hey, you gotta start somewhere), we assume a monthly growth rate of 5%. Growth compounds monthly, meaning that each month’s value is 1.05 times the prior month’s value. To model this in a spreadsheet, we again turn to a formula. This time the formula multiplies the previous value by 1.05 instead of adding a value.


In pez, there are a few ways to tackle this. One approach is to use the cumprod function, which takes a list of numbers and computes the cumulative product of all the numbers in the list from the first element to the current element. For example, cumprod 1..4 yields [1, 2, 6, 24], which is equivalent to [1, 1*2, 1*2*3, 1*2*3*4]. For the growth rate, we create a repeated list of 1.05 and apply cumprod to it.
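
For comparison, a cumulative product can be sketched in Python with the standard library. This is not pez's implementation, just an analogue: the cumprod name is borrowed from the text, and multiplying by 100 starting users is my assumption about how the growth factors become customer counts.

```python
from itertools import accumulate
from operator import mul

def cumprod(xs):
    """Cumulative product: element i is the product of xs[0..i]."""
    return list(accumulate(xs, mul))

print(cumprod(range(1, 5)))  # [1, 2, 6, 24]

# Repeat 1.05 once per month, then scale by the assumed 100 starting users.
growth = cumprod([1.05] * 24)
customers = [100 * g for g in growth]
```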


Calling functions is similar to calling functions in a spreadsheet, where the name of the function is followed by its arguments wrapped in parentheses. Pez supports a simpler syntax as well, which will be discussed in a future post.

You may have noticed that there’s one problem with this approach. While the spreadsheet starts at 100, our pez list starts at 105. We could modify the list to fix this. However, an even simpler approach takes advantage of how compounding works. Since the compounding rate is constant, each successive month raises the growth factor to the next power. The multiplier for month one is just 1, while month two is 1.05, month three is 1.05^2, and so on. Using what we’ve already learned, we can raise 1.05 to the sequence 0..23, which produces all the powers for us!
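
The same trick can be sketched in Python (again an analogue, not pez syntax). Python does not raise a number to a whole range at once, so a list comprehension stands in for the vectorized operation. The variable names are my own.

```python
initial_users = 100
rate = 1.05

# Exponents 0..23: month one keeps the initial value, month two is 1.05x, and so on.
customers = [initial_users * rate ** k for k in range(24)]

print(customers[0], round(customers[1], 2))
```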


Calculating the MRR

The last column to create is the monthly recurring revenue. The current assumption is $25/user/month, so we multiply each value in C2:C25 by 25.


In pez, the range C2:C25 corresponds to the variable customers, so we multiply that by 25 and assign its result to a new variable mrr.


Again, notice how simple it is to describe this operation.
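
In a language without built-in vectorization, the equivalent step needs an explicit loop or comprehension. A minimal Python sketch, rebuilding customers so the snippet runs on its own:

```python
price_per_user = 25  # dollars per user per month, from the assumption above

customers = [100 * 1.05 ** k for k in range(24)]
mrr = [price_per_user * c for c in customers]

print(mrr[0])
```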

Creating the data frame

The final step is to bring all these variables together into a single table. Data frames are organized by column, which is why we claimed that it’s best to think of lists as columns. Each variable we defined is simply a column in the table.

The output table is just like the spreadsheet. To make the table easier to work with, it’s actually better to assign our dates to the index of the table. This reduces the number of columns and sets the index to the dates. We use a special @index key at the end of the table definition to specify the index.

This looks pretty good. However, notice that we had to create a whole bunch of variables to create this table. This pollutes your workspace, which makes it harder to find useful stuff in the future. It’s better to use a let expression to define temporary variables instead.

Now only the variable you care about is created in your workspace. All the others are deleted once the let expression is evaluated.
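
To make the idea concrete for readers who know another language, here is a rough Python analogue. It is a sketch under assumptions, not how pez works internally: a function's local variables play the part of the temporary let bindings, and a plain dictionary with an index plus named columns stands in for the data frame. The build_forecast name and its parameters are hypothetical.

```python
from datetime import date

def build_forecast(n_months=24, initial_users=100, rate=1.05, price=25):
    # These locals vanish when the function returns, loosely like let bindings.
    month = [date(2016 + k // 12, k % 12 + 1, 1) for k in range(n_months)]
    customers = [initial_users * rate ** k for k in range(n_months)]
    mrr = [price * c for c in customers]
    # Column-oriented table: the dates act as the index, the rest are columns.
    return {"index": month, "columns": {"customers": customers, "mrr": mrr}}

forecast = build_forecast()
print(forecast["index"][0], forecast["columns"]["mrr"][0])
```

Only forecast ends up in the surrounding namespace. The intermediate lists exist only while the function runs.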


As a final goodie, here is a plot of the MRR based on the data we created.



Data structures are an important part of programming. In this article, we took your existing knowledge of Excel and showed how cell ranges are lists and tables are data frames. You also got a taste of let expressions and vectorization, which are two powerful features of pez.

Panoptez is a collaborative data analysis and visualization platform accessible via chat systems, like Slack. Request an invite to the beta or contact us for preferred access.

How to calculate monthly recurring revenue (MRR) in Slack instead of Excel

FastCompany wrote an article about Slack, which cited some subscriber numbers. This got me wondering what their monthly recurring revenue (MRR) is based on these figures. The MRR is a key metric that helps determine if your company is cashflow positive or not. Knowing the MRR also gives you insight into a SaaS company’s P/E ratio. Since we don’t know if Slack is profitable, we can’t compute the P/E. We can, however, use price-to-revenue as a naive proxy. In this article, I show how to use Panoptez within Slack to calculate the MRR and P/R instead of Excel (or other spreadsheet program).

A spreadsheet (e.g. Excel, Google Sheets) is often the go-to tool when you want to make a quick back-of-the-envelope calculation. In isolation this is sufficient, but when sharing your calculation with others, it becomes more involved. Within a team, it’s also likely that you want to share your methodology or the function you wrote with your colleagues. In a spreadsheet this becomes a bit more challenging, since it usually means writing a function in Visual Basic or something comparable and then figuring out how to distribute it among your colleagues. For this article, we’ll ignore the sharing aspect and focus on only the calculations. Our baseline will be using Google Sheets to implement these calculations.

The Data

First, we need the raw data. In this case, it comes from FastCompany, which says Slack has 370,000 paid subscribers. Slack has two tiers of pricing, but FastCompany doesn’t break this out for us. The pricing itself comes from Slack, where they list the price of the standard and plus plans.


To get a single value for the MRR, we need to know how many people pay for the standard versus the plus tier. We also need to know how many pay month-to-month versus annually. Since these numbers aren’t available, we have to make assumptions for the proportion of subscribers in each plan as well as the ratio of subscribers paying month-to-month versus annually. My hand-waving guess is 70% pay for the standard tier and 30% pay for plus. I also assume that 70% of the standard tier pay month-to-month and 30% pay annually. For the plus tier I assume the opposite. If you have better assumptions, please let me know in the comments!

Spreadsheet Calculation

In a spreadsheet, the normal procedure is to populate cells with these values and add some labels for the rows and columns. Next we create a formula to hold some intermediate results. In our case this is the weighted monthly value of a user in the standard and plus tiers. The formula bar shows the computed value for the standard tier.


To get the MRR we tally those up and multiply by the number of paid subscribers. This gives us $3.44 million per month, or $41.3 million per year.


That means with a private valuation of $2.8 billion, the P/R is about 68. Remember, this doesn’t equate to the P/E, since we aren’t accounting for expenses, so the P/E will likely be much higher. This is a detail overlooked in the Business Insider article that you shouldn’t ignore.
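
The arithmetic behind that ratio is short enough to check by hand, or with a few lines of Python. The figures below are the rounded ones quoted in this article, and the variable names are my own.

```python
monthly_mrr = 3.44e6  # dollars per month, from the spreadsheet
valuation = 2.8e9     # private valuation in dollars

annual_revenue = monthly_mrr * 12
price_to_revenue = valuation / annual_revenue

print(round(annual_revenue / 1e6, 1), round(price_to_revenue))
```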

Using Panoptez

Now let’s see how to do the same thing in Panoptez. First, we create a nearly identical table. Remember that since this is in Panoptez, once this table is created, any colleague on Slack can access this same table to use as they wish. We’ll create a data frame using { } notation and assign it to the variable slack_stats. In case you’re wondering, a “data frame” is a fancy way of saying “table”.


Here’s a text version so you can copy and paste into your Panoptez-enabled Slack.

Each list within the data frame represents a column of the table. In our spreadsheet, the first column of data represented the standard pricing tier. To reference it, we would create a range from B2:B6. Our data frame holds the same data, except we reference it as slack_stats$standard. The @index at the end of the data frame sets the row names for the table. If we don’t specify this, the rows will simply be numbered sequentially.

To calculate the weighted value of each tier, we’ll create a temporary function. Since Panoptez tracks all variables created in your workspace, it can fill up with a bunch of garbage quickly. To reduce clutter, you can use what’s known as a “let expression” to create temporary variables that will disappear after the expression has been evaluated. The basic structure of a let expression is let x in y. In this example, we create a temporary function f and then apply it to slack_stats$standard. The function itself is doing the same thing as in the spreadsheet formula =B2 * (B3*B5 + B4*B6), except we use the dot product (the ** operator) instead of explicitly summing the two products. The value at x[1] corresponds to B2 in the spreadsheet, since that is the range we are passing to the function. If we had used slack_stats$plus instead, then x[1] would correspond to C2.
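
To see that the dot product really is the same as the spreadsheet formula, here is a small Python sketch. Several things in it are assumptions on my part: the row order of the column (tier share, monthly price, annual price, share paying monthly, share paying annually) and the prices themselves, which are the Standard list prices as I recall them rather than values taken from the table. Note also that Python indexes from 0, so x[0] plays the role of pez's x[1].

```python
def weighted_value(x):
    """Mirror =B2 * (B3*B5 + B4*B6) for a five-element column x."""
    tier_share = x[0]
    # Dot product of the two prices with the two payment-plan shares.
    dot = sum(price * share for price, share in zip(x[1:3], x[3:5]))
    return tier_share * dot

# Assumed layout and prices; the 70/30 splits are the guesses stated earlier.
standard = [0.7, 8.00, 6.67, 0.7, 0.3]

explicit = standard[0] * (standard[1] * standard[3] + standard[2] * standard[4])
print(abs(weighted_value(standard) - explicit) < 1e-12)  # True
```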


Putting it all together, we can take our let expression and use it inside a function! That means we can create a temporary function to simplify the overall calculation. This last step creates a function that accepts the number of paying users and calculates the MRR. Notice that the expression following the in is essentially the same as in the spreadsheet, which was =F2*(B8+C8). The difference is that instead of cell positions, we are using variables and functions. The variable u is equivalent to F2, while f(slack_stats$standard) evaluates to the same value as B8.


This is the code to try in your Panoptez-enabled Slack session.

To get the final result, we simply call this function like !pez slack_mrr 370000. The nice thing about having a function is that as Slack’s user base changes, we can call this function again to get the latest MRR.
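
For readers without a Panoptez account, the whole calculation can be approximated in Python. Treat this as a sketch under stated assumptions rather than a translation of the pez code: the table layout, the use of a nested function in place of the let expression, and the prices (Slack's list prices at the time, as I recall them) are all mine. Only the 70/30 splits and the 370,000 subscriber count come from the article.

```python
# Assumed stand-in for the slack_stats table, one list per pricing tier:
# [tier share, monthly price, annual price, share monthly, share annual]
slack_stats = {
    "standard": [0.7, 8.00, 6.67, 0.7, 0.3],
    "plus": [0.3, 15.00, 12.50, 0.3, 0.7],
}

def slack_mrr(u):
    """Estimate MRR for u paying users, mirroring =F2*(B8+C8)."""
    def f(x):
        # Temporary helper, loosely like the f bound by the let expression.
        return x[0] * sum(p * s for p, s in zip(x[1:3], x[3:5]))
    return u * (f(slack_stats["standard"]) + f(slack_stats["plus"]))

estimate = slack_mrr(370_000)
print(f"{estimate / 1e6:.2f} million per month")
```

Under these assumed prices the estimate lands at roughly the $3.44 million per month computed in the spreadsheet.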



In this post, I’ve shown how to use Panoptez to calculate an estimate of Slack’s MRR. I’ll leave it to the reader to write an expression that calculates the P/R ratio from this. In a subsequent post, we’ll look at changing the assumptions used in this example.


Data-driven collaboration just got a whole lot easier


Today’s Holy Grail is the data-driven organization. Like the Grail, nobody knows what it looks like, though many are on the difficult quest to find it. Becoming data-driven is hard, and many obstacles prevent organizations from reaching this goal. Two major obstacles include data inaccessibility and limited collaboration. The SaaS duo of Slack and Panoptez offers a shortcut around these challenges, getting you faster to data-driven bliss.

The Promise of Being Data-Driven

Data-driven organizations offer numerous advantages over traditional businesses. The promise is that data is a strategic asset that can “inform decision-making processes and drive actionable results” (IBM). When everyone has access to organizational data and analytical tools, “data empowers people to make decisions without having to consult managers three levels up” (VentureBeat). Taking advantage of data means that hunches can be replaced by hard data, enabling even “junior employees to make decisions” (VentureBeat). Consequently, employees are empowered to make decisions and react faster to the market.

Competitive edge is often a byproduct of superior information. While all organizations are producing reams of data around their business and operational processes, data-driven organizations are capturing this data to make it usable. By quantifying processes and collecting this data, inefficiencies can be rooted out and new opportunities discovered. This is usually easy within an organizational silo, but it gets increasingly complicated as you span silos. Paradoxically, these insights typically have the most power. Examples include:

  • How do social media impressions affect sales inquiries?
  • Does content marketing engagement boost registrations?
  • How does lead velocity compare with expectations?
  • How do meetings affect productivity?
  • How do software releases affect customer support volume?
  • Do managers actually matter? (via Google)

Obstacles Along The Way

Most of the benefits of a data-driven organization are only possible when data is transparent and accessible to everyone in the organization. When data is not accessible, it’s near impossible to conduct a holistic analysis. Requests can easily be lost in a swamp of bureaucracy leading to lost opportunities. Furthermore, it creates bottlenecks in the decision-making process when only a handful of people can provide specific datasets.

But why is it so hard to become data-driven? Most IT systems were not designed for interoperability and data sharing, so getting data out of these systems is difficult. According to McKinsey, “existing IT architectures may prevent the integration of siloed information, and managing unstructured data often remains beyond traditional IT capabilities.”

The traditional solution to this problem is to embark on a strategic IT project to integrate the data. While strategic initiatives can benefit a company in the long term, you can’t hold your breath and wait for these projects to be completed. As McKinsey notes, “fully resolving these issues often takes years”. Protracted timelines are anathema to data-driven organizations: who has time to wait years for an answer?


What we really need is to quickly conduct an ad hoc analysis: immediate answers to immediate questions. In finance, desk quants served this purpose, answering complex analytical questions in near real-time. Not everyone needs the firepower of a quant, but many do need immediate answers with the help of analytics.

When IT involvement is not an option, many people resort to spreadsheets as a way to get quick answers. Nowadays it’s fairly easy to collaborate on spreadsheets and get data into them. However, not all data is easily accessible. Operational data is largely buried in legacy systems lacking friendly APIs. Getting data out of spreadsheets for use in other analyses can also be tricky. Dashboards are similarly flawed since data is not easily shared across reports. Interactivity is also typically limited to drill-downs. But what if you want to quickly explore the relationship of one variable with another that might not be in the report? Now you have to ask the analyst who created the dashboard, which again creates bottlenecks!

Slack as a Data Hub

Thanks to Slack service integrations, all sorts of operational data are now appearing in Slack. Spanning all silos of an organization, this operational data feeds in from sales, marketing, customer service, product development, etc. The key is that all these operational events transform Slack into a de facto data hub. Hence, data accessibility no longer requires a strategic initiative: any Slack user has access to organizational data immediately.

If an organization is using Slack, then a lot of attention is already in Slack, making it a great canvas for collaboration. What if it were possible to conduct an analysis and visualize it directly in Slack? Then your colleagues wouldn’t need to switch apps or download anything because it’s right there. Now imagine that your colleague has an idea on how to improve the analysis. What if she could modify your analysis straight from Slack?

Panoptez for Collaborative Analytics

Slack can make data accessible to all, but to truly democratize data, it needs to be usable. Enter Panoptez. Panoptez is a collaborative analytics environment that integrates with messaging systems like Slack. As part of the offering, Panoptez automatically parses messages from service bots and transforms them into data structures. Without ceremony or strategic initiatives, you get your operational data in usable form as it is created.


And since it’s all accessible in your Panoptez environment, you can instantly conduct cross-silo analyses. Want to know how social media marketing campaigns affect customer service inquiries? How do customer support requests affect the velocity of development? Which projects are most tightly coupled? Are development priorities aligned with marketing messages? How much do meetings affect development velocity?

Here’s an example that compares the size of our Kanban Review queue with the number of commits to Bitbucket. The idea is that high commit activity might indicate hastily written code or bug fixes.


At the beginning and end of this series there appears to be outsized commit activity versus the change in the Review queue. Perhaps something in the Active queue is causing this issue? We can add that series to the plot and render it.


The other half of being data-driven is collaboration. Most analytics platforms claim collaboration, but what they mean is presentation. Dashboards, videos, and narratives are for presentation. Collaboration is about working together to arrive at a solution. Panoptez moves beyond comments and annotations on a report. Instead, business users and analysts can share actual data and functions so they can collaboratively conduct an analysis.

In our example, if a product manager wanted to dig deeper into the data, it’s right there in Slack. This statement gets the last few events on our Trello engineering board and has links to the cards in question.


In short, not only is your visualization instantly available to any user in your Slack channel, so are the commands and data to create the visualization. Panoptez democratizes your data and your analytics tools, so everyone within your organization is empowered to make decisions. With Panoptez, becoming data-driven doesn’t have to be an epic journey.

Learn More

Watch my webinar to learn more about how Panoptez makes collaboration easy. Ready to give Panoptez a test drive? Request an invite to our free 30-day trial while we’re in beta.