When Metadata, Modeling Habits & Macro Collide

Apr 4, 2025


The Kickoff

April’s here, and so are the regime shifts. As tax deadlines near and trade policy takes centre stage, quants are navigating a market where assumptions age quickly. In this edition, we look at what’s shifting, what’s scaling, and where models may need rethinking as Q2 begins.

The Compass

Here's a rundown of what you can find in this edition:

  • Some updates for anyone wondering what we’ve been up to.

  • Newest partner additions to the Quanted data lake.

  • What we learned from Francesco Cricchio, Ph.D., CEO of Brain.

  • A closer look at what the data says about using websites for stock selection.

  • Market insights and a roundup of upcoming conferences.

  • How to model the market after Liberation Day.

  • A useful recommendation for any quant who's running on modeling habits.

  • A shocking fact we bet you didn't know about Medallion’s Sharpe ratio.

Insider Trading

Big strides this month on both the team and product side.

We’re excited to welcome Rahul Mahtani as our new data scientist. He joins from Bloomberg, where he built ESG and factor signals, portfolio optimisers, and market timing tools. With prior experience at Scientific Beta designing volatility tools and EU climate indices, he brings deep expertise in investment signals and model transparency.

On the product side, we’ve now rolled out the full suite of explainability tools we previewed in January: SHAP values, ICE plots, feature interaction heatmaps, cumulative feature impact tracking, dataset usage insights and, new this month, causal graphs to interpret relationships between features and targets. We also crossed 3 million features in the system, marking a new scale milestone. More to come as we keep building, now with more hands on deck and more momentum than ever.
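For readers newer to these techniques, here is a minimal, illustrative sketch of the SHAP idea: attributing a model's predictions to individual features. It uses the open-source shap library and a scikit-learn model on synthetic data, and is not a description of how the Quanted platform implements it.

```python
# Illustrative only: a generic SHAP workflow on synthetic data,
# not the Quanted platform's implementation.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 4)),
                 columns=["momentum", "value", "sentiment", "liquidity"])
# Synthetic target: depends on two features plus noise.
y = 0.6 * X["momentum"] - 0.3 * X["value"] + rng.normal(scale=0.1, size=500)

model = GradientBoostingRegressor().fit(X, y)

# TreeExplainer computes SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Mean absolute SHAP value per feature gives a global importance ranking.
importance = pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns)
print(importance.sort_values(ascending=False))
```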

On the Radar

We’ve added BIScience, Brain, Flywheel, JETNET, and Kayrros to our data partner lake—expanding our alternative data coverage across advertising, sentiment, retail, aviation, and environmental intelligence. Each adds to the breadth of features we surface—highlighting what’s most uniquely relevant to your predictive model.

BIScience
Provides digital and behavioral intelligence with global coverage, offering insights into web traffic, advertising spend, and consumer engagement across platforms.

Brain
Delivers proprietary indicators derived from financial texts—such as news, filings, and earnings calls—by combining machine learning and NLP to transform unstructured text into quant-ready signals.

Flywheel
Provides investors with access to a vast eCommerce price and promo dataset that tracks changes in consumer product pricing, discounting, and availability across 650+ tickers and thousands of private comps over the last 7 years.

JETNET
Provides aviation intelligence with detailed data on aircraft ownership, usage, and transactions—delivering a comprehensive view of global aviation markets.

Kayrros
Delivers environmental intelligence using satellite and sensor data to measure industrial activity, emissions, and commodity flows with high spatial and temporal resolution.

The Tradewinds

Expert Exchange

We sat down with Francesco Cricchio, Ph.D., CEO at Brain, where he’s spent the last eight years applying advanced statistical methods and AI to quantitative investing. With a PhD in Computational Physics and years of experience leading R&D projects across the biomedical, energy, and financial sectors, Francesco has built a career solving complex problems — from quantum-level simulations in solid state Physics to machine learning applications in Finance.

In our conversation, he shared how a scientific mindset shapes his work in Finance today, the innovations driving his team’s research, and the importance of rigorous validation and curiosity in building alternative datasets and quantitative investment strategies.

Over your 8+ years leading Brain, how have you seen the alternative data industry evolve? What’s been the most defining or memorable moment for you?

When we started Brain in 2016, alternative data was still a niche concept — an experimental approach explored by only a few technologically advanced investment firms. Over the years, we’ve seen it transition from the periphery to the core of many investment strategies. A defining moment for us came in 2018, when a major U.S. investment fund became a client and integrated one of our Natural Language Processing–based datasets — specifically focused on sentiment analysis from news — into one of their systematic strategies.

Looking back, what 3 key skills from your physics-to-finance journey have proven most valuable in your career?

During my research in Physics, I focused on predicting material properties using the fundamental equations of Quantum Physics through computer simulations. It’s a truly fascinating process — being able to model reality starting from first principles and, through simulations, generate real, testable predictions.

Looking back, three key skills from that experience have proven especially valuable in my work within quantitative finance and data science:

  1. Analytical rigor: Physics taught me how to model complex systems and break them down into solvable problems. This structured approach is crucial when tackling the challenges of quantitative finance.

  2. Strong statistical foundations: The ability to separate signal from noise and rigorously validate assumptions has been essential for building robust and reliable indicators in quantitative research. A constant focus is on mitigating overfitting risk to ensure more stable and consistent out-of-sample performance in predictive models.

  3. Curiosity-driven innovation: Physics trains you to ask the right questions, not just solve equations. That mindset has been instrumental in helping Brain innovate and lead in the application of AI to financial markets.

Could you provide a couple of technical insights or innovations in quantitative data that you believe would be particularly interesting to investment professionals?

One example is the use of transfer learning with Large Language Models (LLMs), where models trained on general financial text can be fine-tuned for specific, domain-focused tasks. We successfully applied this approach in one of the datasets we launched last year: Brain News Topics Analysis with LLM. This dataset leverages a proprietary LLM framework to monitor selected financial topics and their sentiment within the news flow for individual stocks.

Before the advent of LLMs, building NLP models required creating custom architectures and training them from scratch using large labeled datasets. The introduction of LLMs has significantly simplified and accelerated this process. Moreover, with additional customization, LLMs can now be used to understand complex patterns in financial text — a capability that proved crucial when we developed a sentiment indicator for commodities. Given the inherent complexity of commodity-related news, this approach has been especially effective in identifying key factors driving supply and demand, as well as assessing the sentiment associated with those shifts.
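To make the transfer-learning pattern concrete, here is a minimal sketch of fine-tuning a general-purpose language model for a narrow financial sentiment task using the Hugging Face transformers library. The base model, labels, and examples are placeholders; this is not Brain's proprietary framework.

```python
# Illustrative sketch of transfer learning for financial sentiment.
# Model name, labels, and examples are placeholders, not Brain's pipeline.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

texts = ["Crude inventories fell more than expected.",
         "The company cut its full-year guidance."]
labels = [1, 0]  # 1 = positive, 0 = negative (toy labels)

model_name = "distilbert-base-uncased"  # generic base model, assumed available
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

class ToyDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2, logging_steps=1),
    train_dataset=ToyDataset(enc, labels),
)
trainer.train()  # fine-tunes the classification head on the toy examples above
```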

As more data providers offer production-ready signals rather than raw inputs, how do you see the relationship between investment teams and data providers changing?

The relationship is becoming more collaborative and strategic. Investment teams no longer want just a data dump — they’re looking for interpretable, backtested, and explainable signals that can integrate into their decision-making frameworks. Providers like us are becoming more like R&D partners, co-developing solutions with clients and offering transparency into how signals are derived.

In line with this direction, we are also investing in the development of validation use cases through our proprietary validation platform. This tool enables clients to assess the effectiveness of our datasets by performing statistical analyses such as quintile analysis and Spearman rank correlation. Beyond supporting our clients, the platform has also proven valuable for other alternative data providers, offering an independent and rigorous framework to validate the performance and relevance of their own datasets.
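For readers who want to see what those two checks look like in practice, here is a minimal sketch of a quintile analysis and a Spearman rank information coefficient on synthetic data. It is illustrative only and not Brain's validation platform.

```python
# Minimal sketch of dataset validation via quintile spread and Spearman IC.
# Synthetic data; not Brain's proprietary validation platform.
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

rng = np.random.default_rng(42)
n = 1000
signal = rng.normal(size=n)
# Forward returns weakly related to the signal plus noise.
fwd_return = 0.05 * signal + rng.normal(scale=1.0, size=n)

df = pd.DataFrame({"signal": signal, "fwd_return": fwd_return})

# Quintile analysis: average forward return per signal quintile.
df["quintile"] = pd.qcut(df["signal"], 5, labels=[1, 2, 3, 4, 5])
by_quintile = df.groupby("quintile", observed=True)["fwd_return"].mean()
q5_minus_q1 = by_quintile.loc[5] - by_quintile.loc[1]

# Spearman rank correlation (information coefficient).
ic, p_value = spearmanr(df["signal"], df["fwd_return"])

print(by_quintile)
print(f"Top-minus-bottom quintile spread: {q5_minus_q1:.4f}")
print(f"Spearman IC: {ic:.3f} (p={p_value:.3g})")
```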

Aside from AI, what other long-term trends do you see shaping quant research over the next 5, 10, and 25 years?

Of course, it’s very difficult to make specific predictions — especially over long horizons. That said, in the short to mid-term, I foresee a growing adoption of customized Large Language Models (LLMs) tailored to specific financial tasks. These will enable more targeted insights and greater automation in both research and strategy development.

Over a longer time horizon, I believe we’ll see a convergence between human and machine decision-making — not just in terms of speed and accuracy, but in the ability to understand causality and intent in financial markets. This would represent a major shift from traditional statistical models to systems capable of interpreting the underlying drivers of market behavior.

Another significant challenge — and opportunity — lies in gaining a deeper understanding of market microstructure. It remains one of the most intricate and active areas of research, but I believe that in the long run, advancements in this field will dramatically improve how strategies are developed and executed, particularly in high-frequency and intraday trading.

What is the next major project or initiative you’re working on, and how do you see it improving the alternative data domain? 

Our current focus — building on several datasets we've had in live production for years — is to add an additional layer of intelligence by combining multiple data sources into more robust signals. One of our key innovations in this area is the development of multi-source ensemble indicators, which aggregate sentiment metrics from sources such as news, earnings calls, and company filings into a single, unified sentiment signal. These combined indicators often outperform those based on individual sources and significantly reduce noise.

This philosophy is embodied in our recently launched dataset, Brain Combined Sentiment. The dataset provides aggregated sentiment metrics for U.S. stocks, derived from multiple textual financial sources — including news articles, earnings calls, and regulatory filings. These metrics are generated using various Brain datasets and consolidated into a single, user-friendly file, making it especially valuable for investors conducting sentiment-based analysis.
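As a rough illustration of the ensemble idea (and not the Brain Combined Sentiment methodology), a per-stock combined score could be a z-scored, weighted average of the individual source sentiments:

```python
# Illustrative combination of sentiment from multiple text sources.
# Weights and data are made up; this is not Brain's methodology.
import pandas as pd

sentiment = pd.DataFrame({
    "news":     [0.42, -0.10, 0.05],
    "earnings": [0.20, -0.35, 0.15],
    "filings":  [0.10, -0.05, 0.00],
}, index=["AAA", "BBB", "CCC"])

# Z-score each source cross-sectionally so the scales are comparable.
z = (sentiment - sentiment.mean()) / sentiment.std(ddof=0)

# Equal weights as a placeholder; in practice weights could reflect
# each source's historical predictive power.
weights = pd.Series({"news": 1 / 3, "earnings": 1 / 3, "filings": 1 / 3})
combined = z.mul(weights, axis=1).sum(axis=1)

print(combined.sort_values(ascending=False))
```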

Finally, a project we are currently developing involves the creation of mid-frequency systematic strategies. This initiative leverages proprietary optimization techniques and robust validation frameworks specifically tailored for parametric strategies within a machine learning context.

Looking ahead, is there anything else you'd like to highlight or leave on our readers’ radar?

We’re continuing to build new datasets that leverage the latest advancements in AI and statistical methods. All our recent developments are published on our website braincompany.co and on our LinkedIn page.

Numbers & Narratives

Wolfe Research's quants recently ran a study extracting 64 features from corporate websites—spanning 85 million pages—to build a stock selection model that delivered a Sharpe ratio of 2.1 and 16% CAGR since 2020. These features included structural metadata (e.g., MIME type counts, sitemap depth) and thematic divergence using QesBERT, their finance-specific NLP model.

This comes as web scraping enters a new growth phase. Lowenstein Sandler’s 2024 survey found that adoption jumped 20 percentage points—the largest gain across all tracked alternative data sources—with synthetic data also entering the landscape at scale. 

A few things stood out:

  • Sitemap depth ≠ investor depth: Bigger websites might just reflect broader business lines, more detailed investor relations (IR), or legal disclosure obligations. Without knowing why a site expanded, it’s easy to overfit noise disguised as structural complexity.

  • Topic divergence has legs: The real signal lies in thematic content. Firms diverging from sector peers in what they talk about may be early in narrative cycles—especially in AI or clean energy, where perception moves flows as much as fundamentals.

  • This isn’t just alpha—it’s alt IR: The takeaway isn’t that websites matter. It’s that investor messaging is now machine-readable. That reframes how we model soft signals.

For quants, this hints at a broader shift: public-facing digital communication is becoming a reliable, model-ready input. IR messaging, once designed purely for human consumption, is now being systematically parsed—creating a new layer of insight few are modeling. In other words, the signal landscape is changing. And the universe of machine-readable edge is still expanding.
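For anyone who wants to experiment with the topic-divergence idea, here is a hedged sketch that embeds each firm's public text with plain TF-IDF and scores its distance from the sector-peer centroid. Wolfe's QesBERT model is proprietary, so a finance-tuned encoder would replace TF-IDF in any serious attempt; the firm names and texts below are placeholders.

```python
# Illustrative topic-divergence score using TF-IDF embeddings;
# Wolfe's QesBERT-based approach is proprietary and not reproduced here.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy "website text" for four firms in the same sector (placeholders).
docs = {
    "FIRM_A": "cloud platform subscription revenue enterprise software",
    "FIRM_B": "enterprise software licences services consulting",
    "FIRM_C": "software products partners channel sales",
    "FIRM_D": "artificial intelligence models inference chips accelerators",
}

names = list(docs)
X = TfidfVectorizer().fit_transform(docs.values()).toarray()

# Divergence of each firm = 1 - cosine similarity to the centroid of its peers.
for i, name in enumerate(names):
    peers = np.delete(X, i, axis=0)
    centroid = peers.mean(axis=0, keepdims=True)
    divergence = 1.0 - cosine_similarity(X[i:i + 1], centroid)[0, 0]
    print(f"{name}: divergence from peers = {divergence:.3f}")
```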

  1. FT's article on Wolfe

  2. Lowenstein Sandler's Alt Data Report

Market Pulse

We’re adding something new this quarter. With so much happening across quant and data, a few standout conferences felt worth highlighting.

📆 Quaint Quant Conference; 18 April, Dallas | An event where buy-side, sell-side, and academic quants come together to share ideas and collaborate meaningfully. 

📆 Point72 Stanford PhD Coffee Chats and Networking Dinner; 24 April, Stanford | Point72 will be hosting coffee chats and a networking dinner for PhDs.

📆 Hedgeweek Future of the Funds EU; 29 April, London | A summit open to senior execs at fund managers and investors to discuss the future of fund structures, strategies, and investor expectations.

📆 Battle of the Quants; 6 May, New York | A one-day event bringing together quants, allocators, and data providers to discuss AI and systematic investing.

📆 Morningstar Investment Conference; 7 May, London | Morningstar’s flagship UK event covering portfolios, research, and the investor landscape.

📆 Neudata's New York Summer Data Summit; 8 May, New York | A full-day alt data event with panels, vendor meetings, and trend insights for investors.

📆 BattleFin's Discovery Day; 10 June, New York | A matchmaking event connecting data vendors with institutional buyers through curated one-on-one meetings.

Navigational Nudges

Quant strategies often account for macro risk, but political shocks are rarely treated as core modelling inputs. In a recent Barron’s interview, Cliff Asness explained why that approach needs to change when policy shifts start influencing fundamentals directly—through tariffs, regulation, or capital controls. 

He described how AQR evaluated tariff exposure across its portfolios by analysing supply chains and revenue by geography. That kind of work becomes essential when political risk transitions from a volatility driver to a capital flow and cash flow driver. Trump’s “Liberation Day” speech marks exactly that kind of transition. The introduction of sweeping new tariffs has altered expectations around inflation, corporate margins, and global trade flows—conditions quant models rely on to stay stable.

Why does this matter? Markets repriced quickly after the announcement: implied vol rose in trade-exposed sectors, EM FX weakened, and cross-asset dispersion increased. These are clear indicators of a regime shift. Models built under assumptions of tariff-free globalisation now face higher estimation error. Signal degradation, autocorrelation breakdowns, and unstable factor exposures are already visible in the post-speech data. 

What should you be doing now?

1. Quantify Trade Exposure at the Factor and Position Level
Use firm-level revenue segmentation, global value chain linkages, and import-cost pass-through data to surface tariff sensitivity. Incorporate this into both factor risk models and position sizing rules.
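A minimal sketch of what this could look like, assuming you have firm-level revenue-by-region data; the regions, tariff shocks, and weighting below are hypothetical.

```python
# Illustrative tariff-sensitivity score from revenue segmentation.
# Region names, tariff shocks, and revenue shares are hypothetical.
import pandas as pd

# Share of each firm's revenue by region (rows sum to 1).
revenue_share = pd.DataFrame({
    "domestic": [0.70, 0.30, 0.50],
    "china":    [0.20, 0.50, 0.10],
    "europe":   [0.10, 0.20, 0.40],
}, index=["FIRM_A", "FIRM_B", "FIRM_C"])

# Assumed effective tariff increase by region (decimal).
tariff_shock = pd.Series({"domestic": 0.00, "china": 0.25, "europe": 0.10})

# Naive exposure: revenue-weighted tariff shock per firm.
exposure = revenue_share.mul(tariff_shock, axis=1).sum(axis=1)
print(exposure.sort_values(ascending=False))

# This score can then feed a factor risk model or cap position sizes,
# e.g. scaling weights down in proportion to exposure above a threshold.
```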

2. Test for Structural Signal Decay Post-Announcement
Run rolling window regressions and Chow tests to identify breakpoints in factor performance or autocorrelation. Validate whether alpha signals maintain stability across the new policy regime.
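A sketch of the break test under simple assumptions: fit a market-model regression before and after a known event date and compute the Chow F-statistic. The returns below are simulated.

```python
# Illustrative Chow test for a structural break at a known event date.
# Simulated factor returns; not a validated production test.
import numpy as np
import statsmodels.api as sm
from scipy.stats import f as f_dist

rng = np.random.default_rng(7)
n, break_point = 250, 180  # trading days, with the event at day 180
market = rng.normal(0, 0.01, n)
# Beta shifts from 0.8 to 1.4 after the event (simulated regime change).
beta = np.where(np.arange(n) < break_point, 0.8, 1.4)
returns = beta * market + rng.normal(0, 0.005, n)

def ssr(y, x):
    X = sm.add_constant(x)
    return sm.OLS(y, X).fit().ssr

k = 2  # estimated parameters: intercept and slope
ssr_pooled = ssr(returns, market)
ssr_pre = ssr(returns[:break_point], market[:break_point])
ssr_post = ssr(returns[break_point:], market[break_point:])

num = (ssr_pooled - (ssr_pre + ssr_post)) / k
den = (ssr_pre + ssr_post) / (n - 2 * k)
chow_f = num / den
p_value = 1 - f_dist.cdf(chow_f, k, n - 2 * k)
print(f"Chow F = {chow_f:.2f}, p = {p_value:.4f}")
```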

3. Decompose Trend Attribution by Policy Regime
Segment trend strategy PnL around political events and measure conditional Sharpe. Use macro-filtered overlays to isolate persistent price moves from reactive noise.
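A compact sketch of the regime split, using simulated daily PnL and a hypothetical event date:

```python
# Illustrative conditional Sharpe by policy regime on simulated PnL.
# Regime date and returns are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.bdate_range("2024-10-01", periods=160)
pnl = pd.Series(rng.normal(0.0004, 0.01, len(dates)), index=dates)

# Label each day by regime around a hypothetical policy event.
event = pd.Timestamp("2025-04-02")
regime = pd.Series(np.where(dates < event, "pre_event", "post_event"), index=dates)

def annualised_sharpe(x: pd.Series) -> float:
    return float(np.sqrt(252) * x.mean() / x.std(ddof=1))

print(pnl.groupby(regime).apply(annualised_sharpe))
```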

4. Run Event Studies to Quantify Short-Term Sensitivity to Policy Shocks
Use cumulative abnormal return (CAR) analysis across pre-, intra-, and post-event windows. Calibrate against benchmark exposures estimated prior to the shock (e.g., beta coefficients). Apply statistical tests to detect shifts in market-implied expectations of the strategy’s value or future cash flows.
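And a sketch of the event study itself: estimate a market-model beta on a pre-event window, compute abnormal returns over the event window, and cumulate them into a CAR with a simple t-test. All inputs are simulated and the window lengths are arbitrary choices.

```python
# Illustrative CAR event study with a market-model benchmark.
# All returns are simulated; window lengths are arbitrary choices.
import numpy as np

rng = np.random.default_rng(3)
n_est, n_event = 120, 11            # estimation window and event window lengths
market = rng.normal(0, 0.01, n_est + n_event)
stock = 1.1 * market + rng.normal(0, 0.008, n_est + n_event)
stock[n_est + 5:] += 0.004          # inject a small post-event drift

est_m, est_s = market[:n_est], stock[:n_est]
beta, alpha = np.polyfit(est_m, est_s, 1)   # market-model fit on pre-event data

event_m, event_s = market[n_est:], stock[n_est:]
abnormal = event_s - (alpha + beta * event_m)
car = abnormal.cumsum()

# Simple t-test of the final CAR against estimation-window residual volatility.
resid_sd = np.std(est_s - (alpha + beta * est_m), ddof=2)
t_stat = car[-1] / (resid_sd * np.sqrt(len(abnormal)))
print(f"CAR over event window: {car[-1]:.4f}, t = {t_stat:.2f}")
```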

There’s a difference between volatility and change. Quants know how to handle the former. The challenge now is recognising the latter.

The Knowledge Buffet

📖  Modeling Mindsets: The Many Cultures Of Learning From Data 📖

by Christoph Molnar 

Most quants have a default way of thinking—Bayesian, frequentist, machine learning—and tend to stick with what’s familiar or what’s worked in the past. Modeling Mindsets is a useful reminder that a lot of that is habit rather than deliberate preference. It doesn’t argue for one approach over another, but it does help surface the assumptions that shape how problems are framed, tools are chosen, and results are interpreted. Especially worth reading if a model’s ever failed quietly and the why wasn’t obvious.

The Closing Bell

Did you know?

The Medallion Fund’s Sharpe ratio is so high that standard significance tests stop being useful — because its returns are almost too consistent to model using standard risk metrics. 
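One way to see why: under standard assumptions, the t-statistic of an annualised Sharpe ratio estimated over T years is roughly the Sharpe times the square root of T, so at Medallion-like levels even a short track record sits many standard errors above zero. A quick back-of-the-envelope check, with an illustrative (not official) Sharpe figure:

```python
# Rule-of-thumb significance of an annualised Sharpe ratio: t ≈ SR * sqrt(years).
# The Sharpe value below is illustrative, not an official Medallion figure.
import math

sharpe = 2.0   # hypothetical annualised Sharpe ratio
years = 10     # hypothetical track-record length
t_stat = sharpe * math.sqrt(years)
print(f"approximate t-statistic: {t_stat:.1f}")  # ~6.3, far beyond usual thresholds
```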


Quanted Technologies Ltd.

Address

71-75 Shelton Street
Covent Garden, London
United Kingdom, WC2H 9JQ

Contact

UK: +44 735 607 5745

US: +1 (332) 334-9840
