# PSR

## Description

Estimate results from the partial-sample regression model as described by Czasonis, Kritzman, and Turkington in their 2020 research paper (Journal of Portfolio Management, see reference link below).

{% embed url="<https://doi.org/10.3905/jpm.2020.1.167>" %}
Journal of Portfolio Management research paper: Addition by Subtraction (Partial Sample Regression)
{% endembed %}

One of our principals, Mark Kritzman, introduced this powerful model in a lecture at State Street's 2020 research retreat. View a recording of the lecture below.

{% embed url="<https://www.statestreet.com/events/statestreetlive/researchretreat2020/forecasting-technique-with-applications-kriztman.html>" %}
Partial Sample Regression Model in Practice (Recorded Lecture)
{% endembed %}

## Syntax

The following describes the function signature for use in Microsoft Excel's formula bar.

<pre class="language-excel-formula"><code class="lang-excel-formula">=PSR(whichStat, y, x, theta)
=PSR(whichStat, y, x, theta, <a data-footnote-ref href="#user-content-fn-1">"Name1", value1, ..., "NameN", valueN</a>)
</code></pre>

### Input(s)

| Argument      | Description                                                                                                                                                                                                                                                                                                                                                                                            |
| ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **whichStat** | <p>Required. String to specify statistic to return, use one of the following options:</p><p>   "similarity"</p><p>   "informativeness"</p><p>   "relevance"</p><p>   "scaledrelevance"</p><p>   "rank"</p><p>   "filter" = dummy vector to indicate relevant cross-sectional observations</p><p>   "weighted", "relevanceweighted"</p><p>   "yhat" = forecast value(s) for the dependent variable.</p> |
| **y**         | Required. Time series or matrix of dependent variables. This is typically the time series of your portfolios, managers, or asset class returns.                                                                                                                                                                                                                                                        |
| **x**         | Required. Time series or matrix of independent variables. This is typically a set of economic variables or factors.                                                                                                                                                                                                                                                                                    |
| **theta**     | Vector of predictor values, $$\hat{x}$$ , to use with the model parameters (coefficients) to forecast the response variable $$\hat{y}$$ . If this argument is empty, the function will assume the most recent cross-sectional values of the independent variables.                                                                                                                                     |
| **threshold** | Optional. Relevance threshold, a numerical value that specifies the minimum percentage of relevant periods. If the argument is not specified, it defaults to 0.50 (at least 50% relevant periods will be included in the forecast of the partial-sample regression).                                                                                                                            |

### Name-Value Optional Arguments

Specify optional pairs of arguments where Name is the option argument name and Value is the corresponding input object. Name-value arguments must appear after other input arguments above, but the order of these pairs does not matter.

Example:

```excel-formula
=PSR(whichStat, y, x, theta, "Name1", value1, "Name2", value2, ..., "NameN", valueN) 
```

| Name                   | Value                                                                                                                                                                                                                                                                              |
| ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **threshold**          | Threshold value to determine relevance cutoff. If the argument is not specified, it defaults to 0.50 (at least 50% relevant periods will be included in the forecast of the partial-sample regression). See also isPercentile option.                                              |
| **isPercentile**       | Logical, to indicate whether the threshold value is in percentile units or a level value, default = true.                                                                                                                                                                          |
| **thresholdDirection** | <p>Value indicating the comparison used to evaluate relevance against the threshold value:<br></p><p>   <span class="math">value \in\begin{cases} -1: & <  \\+1: & \geq  \end{cases}</span> </p><p></p><p>The default threshold direction is <span class="math">\geq</span> .<br></p> |
| **solveMaxFit**        | Logical (TRUE or FALSE) flag. If true, the regression model will solve for the maximum fit.                                                                                                                                                                                        |
| **selectVariables**    | Logical (TRUE or FALSE) flag. If true, the regression model will solve for maximum fit with the optimal selection of variables. If false, then the model will use all variables when solving for maximum fit.                                                                      |
| **covariance**         | Covariance matrix of the independent variables.                                                                                                                                                                                                                                    |
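
The interaction of `threshold`, `isPercentile`, and `thresholdDirection` can be illustrated with a short Python sketch. This is not the add-in's implementation: the function name `relevance_filter`, its snake_case parameters, and the use of a linearly interpolated quantile for the percentile cutoff are all assumptions made for illustration, and the add-in may compute its cutoff differently.

```python
import numpy as np


def relevance_filter(relevance, threshold=0.50, is_percentile=True,
                     threshold_direction=1):
    """Hypothetical helper: flag observations that pass the relevance cutoff.

    Assumption: when is_percentile is true, the threshold is read as a
    quantile of the relevance scores; otherwise it is used as a level.
    """
    relevance = np.asarray(relevance, dtype=float)

    # Convert a percentile threshold into a level in relevance units.
    cutoff = np.quantile(relevance, threshold) if is_percentile else threshold

    if threshold_direction == 1:    # +1: keep relevance >= cutoff (default)
        return relevance >= cutoff
    if threshold_direction == -1:   # -1: keep relevance < cutoff
        return relevance < cutoff
    raise ValueError("threshold_direction must be +1 or -1")


scores = [1.0, 2.0, 3.0, 4.0]
upper_half = relevance_filter(scores)                           # percentile 0.50, >=
lower_half = relevance_filter(scores, threshold_direction=-1)   # percentile 0.50, <
by_level = relevance_filter(scores, threshold=2.0, is_percentile=False)
```

Under these assumptions, the default settings keep the more relevant half of the sample, while a direction of -1 keeps the complement.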

### Output(s)

The function's output varies with the `whichStat` argument, as described in the following table. For M dependent variables (y) and N independent variables (x) across T observations:

| whichStat            | Output                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **yHat, prediction** | Forecast of dependent variable(s) from the partial sample regression model.                                                                                                                                                                                                                                                                                                                                                                 |
| **relevance**        | Tx1 vector of relevance scores. Relevance is the sum of statistical similarity and informativeness, and it measures the importance of an observation to the prediction. Its components are the informativeness of past circumstances, the informativeness of current circumstances, and the similarity of past circumstances to current circumstances. |
| **similarity**       | Tx1 vector of statistical similarity, measured as the negative of the Mahalanobis distance of the past observations for the independent variables to the current values for the independent variables. Or put simply, past observations that are like the current observations are more relevant.                                                                                                                                           |
| **informativeness**  | Tx1 vector of informativeness as measured by the Mahalanobis distance of the historical observations of the independent variables from their average values. |
| **infoTheta**        | Tx1 vector of informativeness as measured by the Mahalanobis distance of the historical observations of the independent variables from the circumstances specified (theta).                                                                                                                                                                                                                                                                 |
| **weights**          | Tx1 Vector of partial sample regression weights.                                                                                                                                                                                                                                                                                                                                                                                            |
| **fit**              | 1xM Fit values. Fit is the average alignment between relevance and outcomes across all observation pairs for a single prediction. A large value indicates that the observations that are similarly relevant have similar outcomes, in which case one should have more confidence in the prediction. A small value indicates that relevance does not line up with the outcomes, in which case one should view the prediction more cautiously. |
| **filter, included** | Tx1 Dummy vector to indicate sub-sample periods that meet the threshold criteria.                                                                                                                                                                                                                                                                                                                                                           |
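
As a rough illustration of how the similarity, informativeness, and relevance outputs relate to one another, the following Python sketch computes them with NumPy. It is a sketch under stated assumptions, not the add-in's code: the function name `relevance_components` is hypothetical, distances are taken as squared Mahalanobis distances, and the one-half scaling on similarity and on the two informativeness terms is an assumed convention. As in the `theta` argument above, the most recent row of `x` is used when `theta` is omitted.

```python
import numpy as np


def relevance_components(x, theta=None, covariance=None):
    """Hypothetical sketch of the relevance building blocks.

    x          : T x N array of independent variables
    theta      : length-N vector of current circumstances (default: last row)
    covariance : N x N covariance matrix (default: sample covariance of x)
    """
    x = np.asarray(x, dtype=float)
    theta = x[-1] if theta is None else np.asarray(theta, dtype=float).ravel()
    x_bar = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) if covariance is None else covariance
    inv_cov = np.linalg.inv(np.atleast_2d(np.asarray(cov, dtype=float)))

    def sq_mahalanobis(a, b):
        # Squared Mahalanobis distance, one value per row of (a - b).
        d = np.atleast_2d(a - b)
        return np.einsum("ij,jk,ik->i", d, inv_cov, d)

    # Similarity: negative distance between each past row and theta
    # (the 1/2 factor is an assumed scaling).
    similarity = -0.5 * sq_mahalanobis(x, theta)
    # Informativeness of past circumstances: distance from the average.
    informativeness = sq_mahalanobis(x, x_bar)
    # Informativeness of current circumstances, repeated for every row.
    info_current = np.full(len(x), sq_mahalanobis(theta, x_bar)[0])
    # Relevance: similarity plus (assumed) average of the two informativeness terms.
    relevance = similarity + 0.5 * (informativeness + info_current)
    return similarity, informativeness, info_current, relevance
```

With this scaling, the relevance of observation i simplifies algebraically to the cross-product of its deviation from the average and the deviation of `theta` from the average, weighted by the inverse covariance matrix. A different scaling convention would change the magnitudes but not the ranking of observations.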

## Example

![Identify similarity, informativeness, and relevance - deepen your regression models](https://258561627-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MK66-MGuoULhqCDXLwy%2F-MQnsFcD4tEWFmlkjkIY%2F-MQnsIzpVS8whGPfNPMk%2Fimage.png?alt=media\&token=8cb41dd6-268d-4c90-bffc-580041408dec)

{% file src="<https://258561627-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MK66-MGuoULhqCDXLwy%2F-MQnsYQIsi58ykFESpMQ%2F-MQnseRt-2ltZy24msIf%2FPSR.xlsx?alt=media&token=c9e6c942-f69b-4bed-a945-f83771f8eef7>" %}
Example Workbook: PSR
{% endfile %}

[^1]: **Name-Value** pair input argument, variable length, see [Name-Value Optional Argument table](#name-value-optional-arguments) below.
