Brazilian Companies

View data on TidyTuesday

This dataset lists companies in Brazil with their legal nature (sole proprietorship, publicly traded corporation etc.), size (small enterprise, micro enterprise, or other), and qualifications of the owner, as well as their capital stock.

I thought looking at the breakdown of size and type was interesting, as was a comparison by number of companies vs. capital making up each grouping.

Listing all the legal entities in the data overwhelmed the visual a bit, so I did some categorizing myself—all limited liability corporations, all types of partnerships, all co-ops, and then putting a bunch in an “Other” bucket.

Brazilian BreakdownNumber of companies by size and legal category
Brazil → Micro enterprise: 66202Brazil → Other: 42520Brazil → Small enterprise: 32610Micro enterprise → LLC: 51699Other → LLC: 36241Small enterprise → LLC: 31531Micro enterprise → Sole Proprietorship: 14162Small enterprise → Sole Proprietorship: 920Other → Sole Proprietorship: 127Other → Partnership: 2939Micro enterprise → Partnership: 172Small enterprise → Partnership: 150Other → Privately Held Corporation: 2892Micro enterprise → Privately Held Corporation: 3Small enterprise → Privately Held Corporation: 2Other → Other Type: 175Micro enterprise → Other Type: 166Small enterprise → Other Type: 7Other → Cooperative: 79Other → Publicly Traded Corporation: 52Other → State-Owned Enterprise: 15
Brazil
Micro enterprise
Other
Small enterprise
LLC
Sole Proprietorship
Partnership
Privately Held Corporation
Other Type
Cooperative
Publicly Traded Corporation
State-Owned Enterprise
Brazilian Breakdown (Capital)Combined capital (BRL) of companies by size and legal category
Brazilian capital → Small enterprise: 27300875940699Brazilian capital → Other: 21278265850351Brazilian capital → Micro enterprise: 1409569420479Small enterprise → LLC: 27300286627644Other → LLC: 20574951710324Micro enterprise → LLC: 1201120520850Other → Privately Held Corporation: 487636522361Micro enterprise → Privately Held Corporation: 565781600Small enterprise → Privately Held Corporation: 474000Micro enterprise → Sole Proprietorship: 172644830497Other → Sole Proprietorship: 6151224231Small enterprise → Sole Proprietorship: 410219550Other → Publicly Traded Corporation: 164485179899Micro enterprise → Other Type: 34537537836Other → Other Type: 3762019551Small enterprise → Other Type: 2300000Other → Partnership: 34095512890Micro enterprise → Partnership: 700749696Small enterprise → Partnership: 176319505Other → State-Owned Enterprise: 5399925267Other → Cooperative: 1783755827
Brazilian capital
Small enterprise
Other
Micro enterprise
LLC
Privately Held Corporation
Sole Proprietorship
Publicly Traded Corporation
Other Type
Partnership
State-Owned Enterprise
Cooperative

Queries

This was my staging table of sorts, where I pulled in the CSV, formatted company_size, and added a column categorizing legal_nature.

create or replace table companies as

 select * replace (replace(upper(company_size[1])||company_size[2:], '-', ' ') as company_size),
                   -- Turn `small-enterprise` into `Small enterprise`
        -- Group legal entities into categories
        case when legal_nature in (
                    -- pass-through
                    'Sole Proprietorship',
                    'Privately Held Corporation',
                    'Publicly Traded Corporation',
                    'State-Owned Enterprise') then legal_nature
             -- group types containing key phrases
             when legal_nature like '%Limited Liability%' then 'LLC'
             when legal_nature like '%Partnership%' then 'Partnership'
             when legal_nature like '%Cooperative%' then 'Cooperative'
             else 'Other Type'
              end as legal_category
   from 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-01-27/companies.csv'

To get the multi-level Sankey diagram, I needed to union all two sets of data—one just grouping by the company size with ‘Brazil’ as the source, and one grouping by size and legal category to count “flow” between them.

create or replace view companies_sankey as

   select 'Brazil' as source,
          company_size as target,
          count(*) as companies
     from companies
 group by all

union all

   select company_size,
          legal_category,
          count(*) as companies
     from companies
 group by all

 order by 3 desc, company_size, legal_category

I ordered by the third column because this translated between queries without having to change what that column was in the order by clause:

create or replace view capital_sankey as

   select 'Brazilian capital' as source,
          company_size as target,
          sum(capital_stock) as capital_stock
     from companies
 group by all

union all

   select company_size,
          legal_category,
          sum(capital_stock) as capital_stock
     from companies
 group by all

 order by 3 desc, company_size, legal_category