## The Alt-Ac Job Beat Newsletter Post 15

Hi Everyone,

To the new people on the email list, welcome! I suggest perusing the historical posts to see my tidbits of advice and the job board link.

Part of what I wish I knew about in grad school was simply the job options that existed. At most, I knew about RAND/Urban as other possibilities besides going the professor route. I went to school in Albany, and I was aware of several students working at DCJS and OCFS in a research capacity just through second-degree connections. These days many of those state gigs pay better than many professor positions (and they are just as safe and long term as professor gigs as well).

There is a big world out there. I would suggest just perusing the job board link, not for a specific job but to get an idea of the different companies, roles, and opportunities. I intentionally put on jobs I think many people on the list would be qualified for as-is. Your PhD is good for more than writing papers!

### JOBS

For some of the recent gigs:

- California DOJ, 80-108k, SPSS/R
- Microsoft Principal Data Scientist, 134k-282k, machine learning and AI
- City of Denver, Senior Racial Equity Data Analyst, 82k-136k, Stata/R/Python/SQL/Dashboards

The California DOJ position looks similar to the positions at DCJS I mentioned earlier in the post. So the name may change from state to state, but I am guessing most states have similar positions, mostly in the capital.

### EXAMPLE SCIENTIST

Robert Morris, PhD in criminal justice from Sam Houston. Robert first started his own predictive analytics company, focusing on predictive modeling in industrial applications. He is now a Chief Science Officer at a FinTech lending startup. The skills you learn now can be applied in a wide array of areas, and Robert is a great example of that.

### TECH ADVICE

This is more stats advice than tech advice. If you have a set of predicted probabilities, you can calculate various metrics of interest. So say I gave you a set of predicted probabilities, listed as the number of cases followed by the predicted probability for each of those cases:

```
100 - 0.1
50 - 0.3
50 - 0.6
```

The *expected* number of events happening in this sample is the sum of the probabilities, so `100*0.1 + 50*0.3 + 50*0.6 = 55`. So say I worked for the IRS, and this predicted probability is whether someone messed up their taxes. If I had agents audit these 200 tax returns, I would expect to find around 55 returns with errors and around 145 false positives. If your model is well calibrated, the expected number of events and the number of events that actually occur should be close to one another when evaluated over many future events.
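Here is a minimal sketch of that arithmetic in Python. The variable names are just ones I made up for illustration, and the group sizes and probabilities are the toy numbers from the example, not real data.

```python
# Each tuple is (number of tax returns, predicted probability of an error)
# These are the toy numbers from the example, not real data
groups = [(100, 0.1), (50, 0.3), (50, 0.6)]

# Total number of returns in the sample
total_returns = sum(n for n, _ in groups)

# Expected number of events is the sum of the predicted probabilities,
# which for grouped data is count times probability, summed over groups
expected_events = sum(n * p for n, p in groups)

# If every return is audited, the rest are expected false positives
expected_false_positives = total_returns - expected_events

print(f"Returns audited: {total_returns}")
print(f"Expected errors found: {expected_events:.0f}")
print(f"Expected false positives: {expected_false_positives:.0f}")
```

If you have one row per person instead of grouped counts, the same idea is just summing the column of predicted probabilities.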

Another example I like to give is to pretend I had a choice between two tax returns to audit:

```
Option 1: Underpay estimate 1 million, probability fraud 10%
Option 2: Underpay estimate 5 thousand, probability fraud 50%
```

Many people here will say you should audit the return with the higher probability of fraud. If the goal is to recoup the most funds though, it probably makes more sense to audit Option 1. The expected return for Option 1 is `0.1*1,000,000 = 100,000`, whereas the expected return for Option 2 is `0.5*5,000 = 2,500`.
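The same comparison can be sketched in a few lines of Python. The `expected_return` helper and the dictionary layout are hypothetical names I am using for illustration, and this assumes the only thing you care about is the expected dollars recouped (it ignores audit costs and the like).

```python
# Toy numbers from the two audit options above
options = {
    "Option 1": {"underpay": 1_000_000, "p_fraud": 0.10},
    "Option 2": {"underpay": 5_000, "p_fraud": 0.50},
}

def expected_return(underpay, p_fraud):
    """Expected dollars recouped: estimated underpayment times probability of fraud."""
    return underpay * p_fraud

# Rank the options from highest to lowest expected return
ranked = sorted(
    options,
    key=lambda name: expected_return(**options[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name}: expected return {expected_return(**options[name]):,.0f}")
```

Ranking by `p_fraud` alone would put Option 2 first, while ranking by expected return puts Option 1 first, which is the point of the example.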

Best, Andy Wheeler