Fates RN Regression Project

Crimean Archivist · February 25, 2016

[Project officially closed as of May 3.]

We haven't yet figured out whether Fates uses 1 RN, 2 RNs, or some new system, so I've taken it upon myself to figure that out. I'm taking all data points from all sources to try to fit one model or another, and I've got an Excel sheet keeping track of all hit/miss ratios for all values 1-99. It's a daunting task, though; I have 1000 points as of starting this thread, and I estimate it may take 30,000 or more, so please, all help is welcome.

Guidelines for Recording Values:

Dual Guards don't count -- Dual Guard procs first.
DO NOT use data from reloading Battle Saves. See below. Reloading chapters from the beginning is fine.
Not necessary to maintain records for 0 and 100
Just get the value (1-99) and the outcome (H/M) -- circumstances not necessary
Use either Excel or .csv formats -- those will be easiest to merge into the existing models

You can now record and view collected data in real-time on a spreadsheet on Google Drive.

https://docs.google.com/spreadsheets/d/1UaKECEGX7Dyb_JhduG6j_SPDkphsKELdVZvr2mAv5aU/edit?usp=sharing

On the "Data" (front) tab, go to a column and record data in this format, aligned with the appropriate headers.

[Hit Rate], [Outcome]

This will ensure that your data is integrated into the results in real-time. The sheet has a large number of points and functions on it already, so it is fairly normal for it to lag a little on loading or when changing formulas, so be patient.
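For anyone keeping records offline before pasting them in, here is a minimal sketch of how "[Hit Rate], [Outcome]" lines could be tallied per hit value. The function name `tally` and the sample lines are illustrative, not part of the project spreadsheet; the filter follows the guidelines above (values 1-99 only, H or M only, extra fields such as crit notes ignored).

```python
from collections import defaultdict

def tally(lines):
    """Count hits and misses per displayed hit value from 'value, outcome' lines."""
    counts = defaultdict(lambda: [0, 0])  # displayed hit -> [hits, misses]
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip blank or malformed lines
        value, outcome = int(fields[0]), fields[1].upper()
        if not 1 <= value <= 99 or outcome not in ("H", "M"):
            continue  # 0 and 100 are not recorded; anything else is not a result
        counts[value][0 if outcome == "H" else 1] += 1
    return dict(counts)

# Hypothetical sample in the thread's format, including a crit annotation.
sample = ["70,\tH", "77,\tH", "70,\tM", "76,\tH,\tcrit,\t3%", "100, H", ""]
print(tally(sample))
```

Only the first two fields are read, so annotations after the outcome do not affect the counts.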

Thanks in advance for your help!

Latest Version: 23873 data points, 3:56 pm, May 3, 2016. Using data from myself, Balcerzak, XeKr, wvype, Verile, Minternets, myith, Vascela. Confirmed that the game uses 1-RN for values below 50 and weighted (3A+B)/4 for values above 50. The project is now closed.

Eliminated possibilities:

1-RN (upper range)
Unweighted 2-RN (all ranges)
(3A+2B)/5 Weighted 2-RN (all ranges)
Dynamic 1-RN (all ranges)
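To make the candidate models concrete, here is a rough sketch that enumerates all 10,000 pairs of RNs and counts how often each model would register a hit. It assumes RNs are integers 0-99, that a hit occurs when the (weighted) average, rounded down, is below the displayed value, and that a displayed 50 falls on the weighted side; the thread does not confirm any of those details, so treat them as assumptions.

```python
def true_hit(displayed, weight_a=1, weight_b=1):
    """Fraction of (A, B) pairs whose weighted average falls below the displayed hit."""
    total = weight_a + weight_b
    hits = sum(
        1
        for a in range(100)
        for b in range(100)
        if (weight_a * a + weight_b * b) // total < displayed
    )
    return hits / 10000

def one_rn(displayed):
    return displayed / 100

def unweighted_two_rn(displayed):
    return true_hit(displayed, 1, 1)

def hybrid(displayed):
    """1-RN below 50, (3A+B)/4 otherwise (treatment of exactly 50 is an assumption)."""
    return one_rn(displayed) if displayed < 50 else true_hit(displayed, 3, 1)

for h in (25, 75, 76):
    print(h, one_rn(h), unweighted_two_rn(h), hybrid(h))
```

Under these assumptions the unweighted 2-RN model gives 87.75% at a displayed 75 and 88.72% at 76, matching the figures quoted in the discussion, while (3A+B)/4 gives 83.83% at 75.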

Edited May 3, 2016 by BlenD

ruadath · February 25, 2016

Glad to see someone is doing this. Good luck!

shadowofchaos · February 25, 2016

Dual Attacks are okay, yes?

XeKr · February 25, 2016

Do you mind explaining in brief technical terms your methodology and also any problem with the following proposal? (I haven’t taken stats in a few years)

A better experiment would be to test the same hit rate (something around 75% disp hit) repeatedly, and then you could observe the percentage of hits. Then you could say adopt the 1RN system as the null hypothesis, and reject it to a certain statistical significance in favor of the 2rn alternative. We would be assuming that it's the same system that accounts for 75% hit as the other hit rates, but that seems fairly safe as it would be silly to program otherwise. This should require far less than thousands of data points, as the 1 rn and 2 rn systems differ by a lot, around 12.75%.

For margin of error at 99% confidence to be less than 12.75%, it should only require n = 103 data points (MoE = 1.29/sqrt(n), we have an infinite size population).

edit: I suppose I should clarify we might want different margin of error depending on what the sample proportion of hits is, but that's not too hard.
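The sample-size arithmetic can be checked with a short sketch. It uses the worst-case margin formula from the post, MoE = 1.29/sqrt(n), solved for n; the function name `points_needed` is just for illustration.

```python
from math import ceil

def points_needed(margin, numerator=1.29):
    """Smallest n with numerator / sqrt(n) <= margin (99% confidence, worst case p = 0.5)."""
    return ceil((numerator / margin) ** 2)

print(points_needed(0.1275))   # 103, the full gap between 1RN and 2RN at 75
print(points_needed(0.06375))  # 410, half that gap
```

The second call covers the case where the observed proportion lands midway between the two predictions, which quadruples the required sample.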

Edited February 25, 2016 by XeKr

GoXDS · February 25, 2016

Do you mind explaining in brief technical terms your methodology and also any problem with the following proposal? (I haven’t taken stats in a few years)

For margin of error at 99% confidence to be less than 12.75%, it should only require n = 103 data points (MoE = 1.29/sqrt(n), we have an infinite size population).

edit: I suppose I should clarify we might want different margin of error depending on what the sample proportion of hits is, but that's not too hard.

he did mention as a third option that it may be something completely new. thus going through the whole range might answer that

Balcerzak · February 25, 2016

I replayed turn one of Ch 6 (Hoshido) roughly 10 times for this set. I might give it another go for more samples later, but I got bored.

70,	H		
77,	H		
70,	H		
89,	H		
76,	H		
70,	H		
77,	H		
82,	H		
86,	H		
70,	M		
77,	M		
70,	H		
89,	H		
76,	H		
70,	H		
77,	H		
70,	H		
86,	H		
82,	H		
72,	H		
92,	H		
89,	H		
20,	M		
89,	H		
70,	M		
77,	H		
70,	H		
89,	H		
76,	H		
70,	H		
77,	H		
82,	H		
86,	H		
72,	M		
92,	H		
89,	H		
20,	M		
89,	H		
70,	M		
77,	H		
70,	M		
89,	H		
76,	H		
70,	M		
77,	H		
96,	H		
86,	H		
73,	H		
86,	H		
72,	M		
92,	H		
89,	H		
20,	M		
89,	H		
70,	H		
77,	H		
70,	H		
89,	H		
76,	H		
70,	M		
77,	H		
96,	H		
86,	H		
73,	H		
86,	H		
72,	H		
92,	H		
89,	H		
20,	M		
89,	H		
70,	H		
77,	H		
70,	H		
89,	H		
76,	H,	crit,	3%
70,	H		
77,	H		
82,	H		
92,	H		
69,	H		
92,	H		
72,	M		
92,	H		
89,	H		
20,	H		
89,	H		
70,	M		
77,	H		
70,	H		
89,	H		
76,	H		
70,	H		
77,	M		
70,	H		
92,	H		
86,	H		
72,	H		
92,	H		
89,	H		
20,	H		
89,	H,	crit,	1%

Crimean Archivist · February 25, 2016

Dual Attacks are okay, yes?

Dual Attacks are fine and I have not been annotating them in my own data, because I'm currently operating under the assumption that while the Hit calculations are different, they are applied to the same roll-checking process.

Crimean Archivist · February 25, 2016

Do you mind explaining in brief technical terms your methodology and also any problem with the following proposal? (I haven’t taken stats in a few years)

For margin of error at 99% confidence to be less than 12.75%, it should only require n = 103 data points (MoE = 1.29/sqrt(n), we have an infinite size population).

edit: I suppose I should clarify we might want different margin of error depending on what the sample proportion of hits is, but that's not too hard.

I'm trying to avoid confirmation bias, so we need either a concentrated pool of very localized data or a very large, very broad pool of data. Even so, 103 trials would not necessarily be conclusive. The probability of any given outcome can be calculated from the binomial distribution for whatever Hit we use, and we would hope that one model's probability would decay more quickly than the other's, but we still have to match at least two margins of error, one for 1RN and one for 2RN. If the margins of error overlap, the results are still inconclusive.

Also, data too close to 0 or too close to 100 is virtually useless on its own. If I run 100 trials at 99 Hit and get 0 misses, that is within predicted ranges for both 1RN and 2RN. On the other hand, at 50 Hit 2RN predicts an expected value of 5050 hits out of 10000 and 1RN predicts 5000 out of 10000, which is barely measurable, so too close to the middle isn't useful either. The best data comes from 20-40 and 60-80, where the discrepancy between models is large and easy to measure. My best data concentration so far is from around 75-85, and still my highest point count for any individual number is 59 points at 87.
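A quick sketch of why the extremes and the middle are poor test points. It assumes the classic 2RN rule (two RNs from 0-99, averaged and rounded down, hit if below the displayed value), which is an assumption about the model being tested rather than anything confirmed for Fates.

```python
def two_rn_true_hit(displayed):
    """Hit probability under the assumed classic 2RN rule."""
    hits = sum(1 for a in range(100) for b in range(100) if (a + b) // 2 < displayed)
    return hits / 10000

def prob_no_misses(p_hit, trials):
    """Chance of hitting every time in a run of independent attacks."""
    return p_hit ** trials

# 100 attacks at a displayed 99: a clean sweep is unremarkable under either model.
p1 = prob_no_misses(0.99, 100)
p2 = prob_no_misses(two_rn_true_hit(99), 100)
print(round(p1, 3), round(p2, 3))

# 10,000 attacks at a displayed 50: the expected counts barely differ.
expected_1rn_at_50 = round(0.50 * 10000)
expected_2rn_at_50 = round(two_rn_true_hit(50) * 10000)
print(expected_1rn_at_50, expected_2rn_at_50)  # 5000 5050
```

A perfect run at 99 has better than a one-in-three chance even under 1RN, so it cannot separate the models.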

XeKr · February 25, 2016

Well yes wrt the n=103 not being conclusive, but it’d be to 99% confidence (assuming the sample proportion was 0.8775 or higher so outside the 0.1275 margin, obviously nonsensical without actual observations). Higher n would give even better confidence, but it’s diminishing returns since 99% is already very high. I would feel better with higher n too, but I don’t think it has to be 30,000. The margin only shrinks with the sqrt of n, so it’s not very efficient to do those large values. Like a few hundred should be sufficient. There is no confirmation bias if we’re only testing 1 hit rate repeatedly.

I’m not sure I fully understand the 2 margins of error. Doesn’t margin of error apply to the sample mean/proportion? The predicted hit rate for either 1RN or 2RN, which we know since we know what disp hit we tested at, would either be within that margin or not. We would however in the worst case scenario need a lower margin if our sample proportion turned out to be near 0.81375, as a margin of 0.06375 would be necessary to distinguish from either RNG system. (n = 410 for 99% confidence, more is better. I should have mentioned this case instead of the n=103 one, admittedly)

And your last point is why I suggested something around 75%. This clearly doesn’t work for values near 0, 50, 100 given how close 1RN and 2RN are there and how small a margin is necessary to distinguish them.

I guess my question is, what statistical comparison is being used for very large, broad data sets (that might require n = 30,000), and what are the advantages compared to testing a disp hit of 75% (or 25%) around n = 500 times? (10 of us can each do 50 or something, we might want to find a useful place for testing earlygame though. -Skl against Xander or +Skl vs Kaze?)

Also, thanks for doing this and this discussion. It's very helpful.

Edited February 25, 2016 by XeKr

Crimean Archivist · February 25, 2016

Long story short, concentrated data is preferred by far. As you say, the two models predict vastly different values in the limit of however far we go, and so with enough data points at the right locations, it could be over very quickly. However, we need a lot of values regardless, so I'm casting a wide net so that whether we get a lot of concentrated values or relatively distributed ones, it's still useful. For one, I thought a lot of people would be deterred by say, replaying the same map to farm results from the same values over and over again. As for myself, I'm picking up values as I run the maps, because I'm still on my first run and don't want the project to spoil my game experience.

The biggest advantage of collecting broad-range data is that it's simple to execute, since it's very difficult to force the same Hit chance many times in a row. If I get a cluster of results, say from 75 to 80, I could normalize my 75 to 77 range to apply to Hit = 76, weighting the values according to their frequency to minimize error, and then check how close I am to a 76% success rate versus 88.72%. In that case, I would need slightly more points (say 500 as opposed to your predicted 410), but I could pull them from 3-5 different near-equal cases.
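The pooling idea might be sketched like this: combine a narrow band of displayed values, weight each by how many trials it has, and compare the pooled hit rate against the band's average displayed value. The counts for 76-78 are taken from the table posted later in this thread; the function name `pool` is illustrative.

```python
def pool(band):
    """band maps displayed hit -> (hits, misses); returns pooled summary."""
    hits = sum(h for h, m in band.values())
    trials = sum(h + m for h, m in band.values())
    # Frequency-weighted average of the displayed values in the band.
    mean_displayed = sum(v * (h + m) for v, (h, m) in band.items()) / trials
    return mean_displayed, hits / trials, trials

band = {76: (46, 7), 77: (43, 7), 78: (30, 6)}
mean_displayed, rate, trials = pool(band)
print(round(mean_displayed, 2), round(100 * rate, 2), trials)
```

Pooling trades a little precision in the displayed value for a larger sample, which is the trade-off described above.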

Excel is doing the heavy work, so I'm not averse to collecting data from all over the place if the trendlines fit. Right now, where the maximum n is 65, it's a jagged mess, but it should only take a few days of diligent work. Maximum n for a 3-point range is 100 even at 84-86, and that looks promising long-term as well.

Edited February 25, 2016 by BlenD

Soonaynay · February 25, 2016

How should we send you the files?

Crimean Archivist · February 25, 2016

You can attach files here using More Reply Options > Attach Files or do the same in a message to me, either way I'll get it. I don't expect anyone's data will be over the file limit.

I'm still new to the forums so there may be a more efficient way, but that's the one I know.

Crimean Archivist · February 25, 2016

While I'm thinking about it, the ideal testing case would be to take certain characters and certain enemies on a map with no time constraint, hack their stats to make them deal 0 damage to each other, and fix everyone's hit rate at 75. Then all the points we would need could be gathered very quickly by just repeatedly ending the turn with those units next to each other. So if anyone wants to do that, it could be done in a night.

VincentASM · February 25, 2016

If you have a 3DS on 10.5 firmware and a SD card reader or new Nintendo 3DS (and 1.5 GB space on your SD card), you can easily run a hacked version of Fates.

Just use Braindump (pre-release) to dump your romfs, SciresM's romfs extractor and builder to modify the romfs contents, then HANS to load the modified romfs. I may have some Nightmare modules to edit files as well.

I'd help, but I updated to 10.6 already : |

Crimean Archivist · February 25, 2016

I just checked and I'm on 10.6 as well. Old-fashioned way for me, at least.

Edited February 25, 2016 by BlenD

Dark Holy Elf · February 25, 2016

I don't have much to add, except to thank you for doing this. I was definitely wondering if we had 1 RN or 2 RN this time (or something else).

XeKr · February 26, 2016

Okay so some relatively easy options.

If you have a Branch of Fate lunatic save (ideally with animations off, and other difficulties might work too), start from Chapter 6 Hoshido side and have Hinoka attack the Thief. She has 76% chance to hit twice (3% crit).

Kaze is also good here, mine has 75% next to Rinkah attacking the Fighter twice (and Rinkah Tagteam is 71%), but he does have a few levels. Several other characters have hit rates around 70-80% that can also be recorded if desired.

(I wish I tried this sooner, especially since I just realized it's essentially what Balcerzak did...)

Or...

Start a Normal difficulty save, +Skl/-Def (w/e not Bane Luck). In Chapter 1, set the animations to off and skip enemy turn, so you don't need to do it every reset.

Get to Chapter 2. Move Corrin to the right turn 1, wait turn 2 for Kaze to come down. Then attack Kaze turn 3. If you got 1 Skl vs. Xander, the hit rate should be 76. If you got 1 Luck, the hit rate should be 75. If you got neither, the hit rate should be 74. So they're all in the 3 point range, no need to rig any levels.

Here's some data, I'll collect more when I get a chance.

76, H
76, H
76, H
76, H
76, H
76, H
75, M
75, H
75, H
75, H
75, M
75, H
75, H
75, H
75, H
75, H
75, H
75, H
75, M
75, H
75, H
74, H

Also I am playing the Japanese version but I doubt that matters. I can't imagine difficulty matters either but eh (all those numbers are normal except the last two 76s and 75s which are lunatic).

Edited February 26, 2016 by XeKr

Crimean Archivist · February 26, 2016

So I tried to use Normal/Casual for the Battle Save function with XeKr's method because I got 75 Hit on the nose and wanted as many data points as fast as possible. After 32 attempts at this method, I got a 32/0 H/M ratio. I cannot confirm from this little data whether or not loading a Battle Save generates new RNs. However, after 3 tries reloading Chapter 1 I got 2/1 H/M ratio, so at least I know that reloading a chapter generates new RNs. I am excluding the Battle Save data until we know more.

EDIT: I restarted Ch 1 repeatedly, Battle Saving at the beginning of Turn 3 each time, until I got a miss. Reloading that Battle Save then gave a hit/miss ratio of 0/10. The probability of this occurring in a 1RN system (the higher of the two in this case) is less than one-thousandth of a percent. Given the similar data after loading Battle Saves after a hit, I'm going to say that it's extremely likely if not altogether confirmed that the game generates RNs at the beginning of a chapter and preserves those when loading Battle Saves. Ergo, don't use Battle Saves to farm your data.
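The streak probabilities are easy to check, assuming each reload would be an independent roll at a displayed 75 if the RNs were regenerated. The 2RN figure of 87.75% is the classic 2RN value quoted elsewhere in the thread.

```python
def streak_probability(p_single, length):
    """Chance of the same outcome on every one of `length` independent attacks."""
    return p_single ** length

# Ten misses in a row at a displayed 75.
ten_misses_1rn = streak_probability(1 - 0.75, 10)
ten_misses_2rn = streak_probability(1 - 0.8775, 10)

# Thirty-two hits in a row at a displayed 75 under 1RN.
thirty_two_hits_1rn = streak_probability(0.75, 32)

print(ten_misses_1rn, ten_misses_2rn, thirty_two_hits_1rn)
```

Ten straight misses under 1RN works out to 1 in 1,048,576, a little under 0.0001 percent, and 2RN makes the streak rarer still, so fresh rolls on each reload are not a plausible explanation.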

Edited February 26, 2016 by BlenD

Crimean Archivist · February 26, 2016

Do we know how skills such as Astra are treated yet? Ryoma just proc'ed it for me and I don't know whether to count it as 1 hit or 5. The crit RN definitely rolls individually, but other than that I'm not sure.

XeKr · February 26, 2016

Each hit is rolled separately. Each hit can crit, but cannot activate skills. Each strike also counts as an attack wrt building or using up the Dual Guard meter.

And it seems the battle saves work like Awakening (iirc). Forgot to mention it but yeah, you have to reset the chapter (it's why I detailed all 3 turns).

Also, soft reset is L+R+Start, just in case for anyone.

edit: if it matters, here is a video. [spoiler=Conquest 25]https://youtu.be/A43Dg5mwnSA?list=PLAV-xzjVBR0XPbHsLonTjLW5FlQJuuLLN&t=1924

~~Ryoma's hit sucks so it should happen soon anyway.~~

Edited February 27, 2016 by XeKr

Crimean Archivist · February 27, 2016

Most of my data so far has been collected across just playing the game, and now at almost 2000 points of data I thought it might be useful to highlight some of it. I have several hundred data points between 70 and 90 right now, thanks to largely consistent hit chances throughout the game, and here's what those results look like:

Chance  Hits  Misses
70      33    10
71      17    10
72      24    5
73      23    6
74      22    4
75      39    10
76      46    7
77      43    7
78      30    6
79      55    6
80      41    6
81      54    6
82      38    3
83      28    0
84      59    3
85      70    2
86      56    1
87      76    7
88      26    1
89      57    2
90      45    3

In general, hit counts are above the 1RN prediction level, with just a few (like 71 here) below. I've also been maintaining a chart of the relative hit-miss ratios for each value, and that currently looks like this:

All of the data is normalized to show the relative hit/miss ratio at each point, but doesn't show how many data points contribute to that value. High hit values are generally well-represented, with at least 25 points apiece, while hit values less than 40 are sometimes represented by fewer than 5 points. The graph is accordingly more erratic on the low end than the high end.

Oddly enough, if one draws a line from (1, 0) to (99, 100), the majority of hit values have hit rates above the line. Obviously we expect this from a 2RN system on the high end, so that isn't that strange, but there are plenty of low numbers well above it as well. If the system used is one we've seen before, this should diminish with time, but for now it's a curiosity at least.

Edited February 27, 2016 by BlenD

Crimean Archivist · February 28, 2016

I spent all of this morning farming data from Chapter 2 at 75 Hit. I now have 200 points at that value.

Results:

160 hits, 40 misses (yes, exactly)

99% confidence interval: 72.73 to 87.27 -- calculated by P = Hits/Trials, P +/- 2.57*sqrt(P(1-P)/Trials) in Excel

I checked this interval against an online calculator using the Clopper-Pearson (precise) method, and that gave 71.782 to 86.743, so there is some discrepancy due to rounding/formula exactitude, but not much.
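For anyone who wants to reproduce both intervals, here is a sketch using only the standard library. The first function is the normal-approximation formula quoted above; the second finds the Clopper-Pearson bounds by bisection on the binomial tail probabilities, which is one common way to compute them and not necessarily what the online calculator does.

```python
from math import comb, sqrt

def wald_interval(hits, trials, z=2.57):
    """Normal-approximation interval: P +/- z * sqrt(P(1-P)/trials)."""
    p = hits / trials
    half_width = z * sqrt(p * (1 - p) / trials)
    return p - half_width, p + half_width

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(hits, trials, alpha=0.01):
    """Exact interval: invert the binomial tails at alpha/2 on each side."""
    def solve(tail, target):
        # tail(p) is increasing in p; bisect for the p where it crosses target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if tail(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = 0.0 if hits == 0 else solve(
        lambda p: 1 - binom_cdf(hits - 1, trials, p), alpha / 2)
    upper = 1.0 if hits == trials else solve(
        lambda p: 1 - binom_cdf(hits, trials, p), 1 - alpha / 2)
    return lower, upper

wald = wald_interval(160, 200)
exact = clopper_pearson(160, 200)
print([round(100 * x, 2) for x in wald])   # [72.73, 87.27]
print([round(100 * x, 2) for x in exact])
```

The exact interval is not symmetric around 80%, which accounts for part of the difference from the Excel figures.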

Notably, the 2-RN expected value (87.75) has fallen outside of the confidence interval, although it still pops back in from time to time. The strange thing is that the data is centered at 80, as opposed to either predicted value. I expected the ratio to eventually normalize to one or the other, but it hasn't. Continuing trials in hopes of the values normalizing to 75 -- if they don't, it's possible we're looking at an all-new system.

XeKr · February 28, 2016

I accidentally saved over my 75% Kaze branch of fate with a different Corrin/Kaze so I had to try something different. In Hoshido 6, Corrin with Dragonstone attacks the Fighter at 75%, then again with Azura's help (move so she doesn't give the +10 hit). Hinoka attacks the thief twice at 76% hit/3 crit, Kaze attacks thief twice at 74% hit/2 crit. 3 hits from Hinoka/Kaze kill the thief so some resets only have 5 data points instead of 6.

This is Lunatic, Classic, in the Japanese version. I think I got around 82% hits, no crits, in 174 data points. Still unclear, but maybe there is a new system...(though I'm not sure people did this kind of rigorous testing for Awakening)

75,H
75,H
76,M
76,H
74,H
74,H
75,H
75,H
76,H
76,H
74,H
75,H
75,H
76,H
76,H
74,H
75,M
75,H
76,H
76,H
74,H
75,H
75,H
76,H
76,H
74,H
75,H
75,H
76,M
76,H
74,H
74,H
75,H
75,H
76,H
76,H
74,H
75,M
75,M
76,H
76,H
74,H
75,H
75,H
76,M
76,H
74,H
74,H
75,H
75,H
76,H
76,H
74,M
74,M
75,H
75,H
76,H
76,H
74,H
75,H
75,H
76,M
76,H
74,M
74,H
75,H
75,H
76,H
76,H
74,H
75,H
75,H
76,H
76,M
74,H
74,H
75,H
75,H
76,M
76,H
74,H
74,H
75,H
75,H
76,H
76,H
74,H
75,M
75,H
76,H
76,H
74,H
75,H
75,M
76,H
76,H
74,H
75,H
75,H
76,M
76,M
74,M
74,H
75,H
75,H
76,H
76,H
74,H
75,H
75,M
76,H
76,M
74,H
74,H
75,H
75,M
76,H
76,H
74,H
75,M
75,H
76,M
76,H
74,M
74,H
75,H
75,H
76,M
76,H
74,H
74,H
75,H
75,H
76,H
76,H
74,H
75,M
75,H
76,H
76,H
74,H
75,H
75,H
76,H
76,H
74,H
75,H
75,H
76,H
76,M
74,H
74,M
75,H
75,H
76,H
76,H
74,H
75,H
75,M
76,H
76,M
74,H
74,H
75,H
75,H
76,M
76,H
74,H
74,H
75,H
75,H
76,H
76,H
74,H

Edited February 28, 2016 by XeKr

Crimean Archivist · February 28, 2016

Combining your data and mine, we now have over 300 data points at 75 Hit and exactly 498 in the 74-76 range with a weighted average of just above 75. In both data sets, both the 1-RN and the 2-RN expected values are outside the confidence interval. The Hit percentage for just 75 is right at 82.09%, and for 74-76 it is at 82.66%.

We're almost certainly looking at a new system, and I haven't the faintest clue what it is. I'm glad I've still been collecting data from various hit chances, but right now I still don't have anything substantial to go off of. Just keep trucking, I guess, and I'll be looking for patterns at other Hit values.

Moving forward from the little we know:

Assumptions:

The hit rates trend upwards for all values, e.g., the actual hit rate for 43 Hit will always be larger than the hit rate for 42 Hit.
All outcome ratios will eventually trend towards their actual values and can provide confidence intervals that narrow as the number of trials increases.

Postulates:

The upper bound for "True Hit" for any given value is its upper bound of the 99% confidence interval or the upper bound of the confidence interval of any number above it, whichever is lower. This will be useful for determining values of numbers with fewer data points relative to larger, more common numbers.
The lower bound for "True Hit" for any given value is its lower bound of the 99% confidence interval or the lower bound of the confidence interval for any number below it, whichever is higher. This will be useful for determining values of numbers with fewer data points relative to smaller, more common numbers.
At a very large number of trials (n>400), the marginal change in hit/miss ratio for additional trials is trivial and can be ignored, with the measured rate taken to be approximately equal to the "True Hit" value. So, until future measurements prove otherwise, I'm taking the "True Hit" value at 75 to be ~82. The ideal limit would be much higher but that would just be nightmarish to continue measuring nonstop.
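The first two postulates amount to tightening each interval with its neighbours under the assumption that hit rates never decrease as the displayed value rises. A minimal sketch follows; the intervals in the example are made-up numbers for illustration, not project data.

```python
def tighten(intervals):
    """intervals maps displayed hit -> (lower, upper); returns tightened bounds.

    Lower bound: highest lower bound among this value and all values below it.
    Upper bound: lowest upper bound among this value and all values above it.
    """
    values = sorted(intervals)
    lows = {}
    running_low = 0.0
    for v in values:
        running_low = max(running_low, intervals[v][0])
        lows[v] = running_low
    tightened = {}
    running_high = 1.0
    for v in reversed(values):
        running_high = min(running_high, intervals[v][1])
        tightened[v] = (lows[v], running_high)
    return dict(sorted(tightened.items()))

# Hypothetical intervals: 75 is well sampled, 74 and 76 are not.
example = {74: (0.70, 0.92), 75: (0.76, 0.88), 76: (0.72, 0.95)}
print(tighten(example))
# {74: (0.7, 0.88), 75: (0.76, 0.88), 76: (0.76, 0.95)}
```

The well-sampled value at 75 pulls in the upper bound for 74 and the lower bound for 76, which is the benefit the postulates describe for sparsely sampled values.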

All of these things can be determined quickly using Excel, so don't worry too much about the individual calculations; I've got that. However, it may take as many as four or five independent and well-spaced points to begin trying to fit models or functions to the data, so I'd start hunting for other locations to repeatedly test values like 25 and 50, but especially low values because values 85+ are actually pretty common in-game.

Edited February 29, 2016 by BlenD

eclipse · February 29, 2016

My guess is a weighted RNG (because Silas shouldn't have died to a 1% crit on multiple runs of the same chapter).

Can you keep track of not only hits/misses, but which side has the hits/misses?
