So what I'd like to do today is to draw together some threads and consider how we use the data that we've obtained from our analysis to contribute to the interpretation of the archaeological context from which those data were obtained.
There are a number of steps in that process. We need to think about how we go about assessing the validity of our data. We've talked about that a little bit already, but we need to make sure that we're presenting it in such a way that others can also judge the validity of our data. We need to remember always that when we are dealing with instrumental techniques, we're dealing with approximations, and we're looking at an estimate of composition, whatever we do.
We also need to think about how we begin to explore that data, by visualizing it in a variety of ways and by performing statistical analysis to understand and characterize variability within our data set, and how we can begin to compare our data with data from other, previous analyses, so-called legacy data.
And above all, we need to think very carefully about how we integrate the results of our exploration of material and data into an archaeological interpretation: how we can begin to reasonably talk about the results of our analysis as a thread within a wider understanding of the past.
And I think that's what we're all ultimately aiming for. As archaeological scientists, or as archaeologists who have a working understanding and knowledge of science, as informed consumers of scientific data, we need to be thinking about these things. We need to remember, thinking back to our early sessions in this course, that it's very rare for data to be presented as a single analysis. We're looking at the mean values from a series of replicates, or we're looking at the results of continuous data collection over a period of time.
So, the two options: if we were dealing with AES, we might be looking at the results of three replicate analyses on the same sample. If we're dealing with XRF, we might be looking at the results of counting emissions over a period of perhaps 120 seconds, for example.
We need to understand that if we're looking at mean values, then we also need to understand the standard deviation of those values, the measure of instrumental precision: how close to each other were the values we obtained when we measured a number of times?
We also need to understand the limits of detection for that particular technique, instrument, and analytical environment: our ability to distinguish the presence of a material above the background noise of our instrument.
We also need to understand the accuracy of results with respect to standard reference materials, and when we're presenting our data, we need to show all these things so other people can also judge and assess the reliability of our data, so they can make decisions about how they're going to build it into wider comparisons, and whether, and in what way, they can realistically compare the results that they've obtained with our data.
And so in our tables of results we need to present information about the limits of detection for the analysis: at what level will we be unable to detect the presence of a particular element? We need to provide information about the accuracy of our analysis with respect to standard reference materials, and we need to provide information about instrumental precision, at least in terms of the variability of repeat runs of the same analysis, so that people can judge, and so that we can judge as well, at what level it becomes significant to interpret differences between samples.
There is of course the broader question of whether that is really what we should be interested in, and whether the question of variability across the sample is equally important. When we're trying to reconstruct the composition of a material, is instrumental precision always going to give us the best measure of variability across the sample? Evidently not always, but it's not always possible to do multiple analyses on a single sample.
And so we have to recognize the limitations of our analytical approaches, particularly with things like laser ablation ICP-MS, for example, where we're taking very, very small samples and where variability between sampling sites is likely to be much greater than with a technique which is taking more material and using that as the basis of our estimate of bulk composition.
We also need to think about and present information, in our reports and in our papers, about instrumental parameters: what were the conditions in which our analysis was performed, so that somebody could theoretically come along and follow the same steps, could set up their instrument to the same parameters, could follow the same dilution steps and procedures, and produce the same results.
Theoretically, although of course in archaeology, in honesty, that very rarely happens. Where it does, it's really important that we present fully the comparisons between different analytical approaches, so that we can understand how broader comparisons between datasets can be done.
And so here's the kind of thing you might be presenting alongside your tables of results. In this case you're showing the results of what looks like a wavelength-dispersive analysis, and you're stating the specific wavelength that you're using to quantify your results. You're stating the limits of detection for those particular analyses, based on an assumption that you're dealing with a 10 milligram sample dissolved in 25 milliliters of solution.
And you are providing information about the mean values obtained from a series of replicate analyses, in this case two analyses here, four, three, two again, on a series of standard reference materials, and you're providing them in comparison to the certified values so that you can see how close your analytical results are to those values. So someone coming to look at your analytical procedure can understand, more or less, how close your results will be to the actual composition of the material that you're looking at.
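To make the point about the 10 milligram sample in 25 milliliters concrete, here is a minimal sketch of how a detection limit measured in the solution translates into a detection limit in the solid sample. The function name and the 0.01 µg/mL figure are hypothetical, chosen only for illustration; they are not values from the table being described.

```python
def solid_lod_ug_per_g(solution_lod_ug_per_ml, sample_mass_mg=10.0, volume_ml=25.0):
    """Convert a detection limit in solution (ug/mL) to one in the solid (ug/g).

    Assumes the whole sample is dissolved and made up to volume_ml,
    as in the 10 mg / 25 mL case mentioned in the lecture.
    """
    sample_mass_g = sample_mass_mg / 1000.0
    dilution_factor = volume_ml / sample_mass_g  # mL of solution per g of sample
    return solution_lod_ug_per_ml * dilution_factor


# Hypothetical solution detection limit of 0.01 ug/mL
lod_solid = solid_lod_ug_per_g(0.01)
print(f"{lod_solid:.1f} ug/g in the solid, i.e. {lod_solid / 10000:.4f} wt%")
```

The design point is simply that a smaller sample or a larger dilution raises the effective detection limit in the object itself, which is why the sample mass and volume need to be stated alongside the limits.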
So let's look at an example where we can explore the data a bit more fully, and look at how we can begin to quantify the various aspects of accuracy and precision.
Here you are looking at a series of analyses carried out on a certified reference material, or standard reference material, with the certified values on the first row for the various elements, and below a series of analyses undertaken in the same lab with the same instruments over a period of time.
We can approach this in a variety of ways. If we just look at the values for tin as an example, then we can treat all of our analyses as equivalent; we can assume that all of those analyses were conducted in comparable conditions, and we can look at that whole spread of data as being a measure of the accuracy and precision of our instrument as a whole.
If we do that, we can express our data in a number of ways. We can look at the range of values for tin that we obtained, and you can see that you're looking at a range of about one weight percent, between 10.2 and 11.2, roughly speaking.
But we're looking at a distribution, and distributions are commonly assumed to be normal, inasmuch as they have a bell-curve-like shape. If we investigate our distribution in that way, we can get a better sense of its spread and its skew. So if we look at the mean and standard deviation: the mean is our average value, and the standard deviation is a measure of spread around the average value. In a normal distribution about 68% of the data will be within one standard deviation, plus or minus, of the mean, so here between around 10.4 and 11%. About 68% of our data is falling in that bracket; that's what that's telling us.
Another measure of this would be to look at the median and interquartile range, so that's looking at the cumulative frequency of the data. You can see that the median gives us a slightly lower value for the midpoint of our data, at 10.6, with an interquartile range of 0.4, which is telling us that 50% of our data is within that bracket.
And we can plot this on the graph. I was waving at the graph there without explaining what it was. So this is a plot of that data: you have your full range here as whiskers on either side of the box plot, and the box is showing us the interquartile range, so the lower line is our first quartile, the 25th percentile, and the upper line is our third quartile, the 75th percentile, with the median in the middle, and that means that 50% of the data is below the median and 50% of the data is above.
We can actually see from this plot that our median is relatively low down, so we're looking at a slightly positively skewed distribution. More of our data is in this lower range. In fact, 25% of the data is between 10.6 and, what, 10.54-ish.
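The numbers behind a box plot like this can be worked out in a few lines. This is only a sketch: the tin values below are hypothetical, made up to resemble the kind of spread described, and are not the reference-material data on the slide. Note too that different software uses slightly different quartile conventions, so the quartiles may not match a spreadsheet exactly.

```python
import statistics

# Hypothetical tin values (wt%) for repeat analyses of one reference material
tin = [10.2, 10.4, 10.5, 10.5, 10.6, 10.6, 10.7, 10.9, 11.0, 11.2]

mean = statistics.mean(tin)
sd = statistics.stdev(tin)                   # sample standard deviation (n - 1)
median = statistics.median(tin)
q1, _, q3 = statistics.quantiles(tin, n=4)   # first quartile, median, third quartile
iqr = q3 - q1

# Fraction of the values lying within one standard deviation of the mean
within_one_sd = sum(mean - sd <= v <= mean + sd for v in tin) / len(tin)

print(f"range  {min(tin)} to {max(tin)}")
print(f"mean   {mean:.2f}, sd {sd:.2f}, within 1 sd: {within_one_sd:.0%}")
print(f"median {median:.2f}, quartiles {q1:.2f} to {q3:.2f}, IQR {iqr:.2f}")

# A mean sitting above the median is a rough hint of positive skew
print("mean above median:", mean > median)
```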
OK, so that gives us a way of understanding the spread of our data, and this doesn't just apply, as we'll see, to individual analyses of an individual material.
We can assess accuracy and precision from this, again assuming that this is all undertaken in comparable conditions. We can look at the accuracy in terms of relative error by calculating the absolute error: taking away the known value, VE, from the value obtained in our analysis, VA, which is taken as the mean value for all of our data. We can divide that by the known value to produce a relative percentage error. And so we're looking at a relative percentage error of just under 1% for the whole data set.
We can do the same thing with the standard deviation. If we take the standard deviation and divide it by the mean value for the data, then we get a relative measure of the variability. We are getting a measure of the repeatability by looking at the percentage relative standard deviation.
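Written out, the two calculations just described look like this. It is a minimal sketch: `v_a` and `v_e` follow the VA and VE labels used above, and the certified value and measurements are hypothetical numbers, not the data from the slide.

```python
import statistics


def relative_error_pct(v_a, v_e):
    """Relative percentage error: (obtained value - known value) / known value."""
    return (v_a - v_e) / v_e * 100.0


def rsd_pct(values):
    """Percentage relative standard deviation: standard deviation / mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical certified tin value and repeat measurements (wt%)
certified_tin = 10.6
measured_tin = [10.5, 10.6, 10.7, 10.8, 10.9]

v_a = statistics.mean(measured_tin)
print(f"relative error: {relative_error_pct(v_a, certified_tin):+.2f}%")
print(f"%RSD:           {rsd_pct(measured_tin):.2f}%")
```

The relative error keeps its sign here, so you can see whether the method reads high or low against the certified value.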
We could have a look at this in a slightly different way. If we are working in a lab, or if we wanted to examine the lab that we were thinking of using to do our analysis, it might be useful to know how reproducible their data was. Here you're seeing measurements over time. If we don't assume that these are exactly equivalent, then we can get a measure of the reproducibility of the data in that lab by taking the relative standard deviations of each of these events, for which we internally assume there are standard conditions, and by pooling those relative standard deviations to give us a percentage relative standard deviation of that pooled value. That gives us a sense of what's called the reproducibility. And we could do the same thing for individual operators as well, if we wanted to compare them by those statistics.
We could do that in a similar way, or we could look at statistically comparing, for example, two different sets of data on the same materials. Perhaps we would like to see if we have similar values between two different techniques in our lab: we have the same materials, we are analyzing them in two different ways, and we want to see whether the accuracy of those results is compatible. We can do that by taking our data and comparing them statistically using, for example, a t-test, where we're comparing the mean values of the data and asking: are these the same, given the data that we have?
We might also want, for example, to compare the precision of our values. We know that we're getting relatively similar results in terms of the mean values for two different analysts, but it looks like one analyst might be a little less robust in his or her approach. We could test that statistically by applying what's called an F-test, comparing the variance of two sets of data on the same materials.
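As a rough illustration of what those two tests calculate, here is a sketch of the test statistics themselves. It makes assumptions the lecture does not specify: the two sets are treated as independent samples, using Welch's form of the t statistic (a paired test would be an alternative where the very same objects are measured twice), and the analyst data are hypothetical. Each statistic would then be compared with a tabulated critical value at the chosen significance level, which this sketch does not do.

```python
import math
import statistics


def welch_t(a, b):
    """t statistic comparing the means of two independent samples (Welch's form)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def f_ratio(a, b):
    """F statistic: ratio of the larger sample variance to the smaller."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return max(var_a, var_b) / min(var_a, var_b)


# Hypothetical tin results (wt%) from two analysts on the same material
analyst_1 = [10.5, 10.6, 10.7, 10.6, 10.6]
analyst_2 = [10.2, 10.9, 10.5, 11.0, 10.4]

print(f"t = {welch_t(analyst_1, analyst_2):.2f}")
print(f"F = {f_ratio(analyst_1, analyst_2):.1f}")
```

In this made-up case the means are the same, so t is zero, while the variance ratio is large, which is the "similar means, different precision" situation described above.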
Another way to do that, in terms of comparing the two sets of data together: if we have two sets of data on the same objects, from different analysts or different techniques, and we wanted to see how well correlated those two are, then we could look at plotting those and assess their correlation using a variety of different methods of correlation analysis, and produce a value for that correlation.
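One of those methods is the Pearson correlation coefficient, sketched below. It is picked here only as a familiar example and is not necessarily the method behind any plot in the lecture; the paired XRF and ICP values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical tin results (wt%) for the same five objects by two techniques
xrf = [2.1, 5.4, 8.0, 10.3, 12.9]
icp = [1.9, 5.0, 8.4, 10.1, 13.2]

print(f"r = {pearson_r(xrf, icp):.3f}")
```

A value near 1 says the two sets rise and fall together; it does not by itself say they agree in absolute terms, so it is worth looking at the plot as well.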
And again, these tests are producing values of statistical reliability, if you like, and usually what happens is we will set a particular level of significance below which we will reject the result, and say that our hypothesis that these two are different is confirmed or rejected depending on the results. So if we set our statistical significance level at 95%, then we have to demonstrate that there is a 95% probability or more that the two sets of data are the same. If we have less than that, then we will reject it and treat them as different, and make appropriate changes to our decisions when we're doing the interpretations.
Another thing to consider: if we think back to our data set before, you would notice there were several blank spaces in some of those columns. That's because we were getting measurements which are actually below the limits of detection for our particular analysis. Now, a really important thing to think about is what limits of detection means and how we cope with that.
Because if we say that our limit of detection is 0.01 and we get a measurement which is below the limit, do we just report that as zero? Is it the same? Evidently the answer is no. So what do we report?
There's a whole range of different approaches to doing that. In many tables you'll see "less than 0.01", or whatever the limit of detection happens to be, or just simply "below limits of detection". In other tables, what you might see is a stated limit of detection, and then all of the values which are below that limit are given a standardized value. This is often half the limit of detection, or the limit of detection divided by the square root of 2. In some ways that's less important than treating them consistently and stating clearly what you're doing.
These are accepted ways to approach this. There are other, more statistically complicated ways of dealing with this, where you are what's called imputing the values from statistical ranges, or assigning randomized values within particular ranges. And again, as long as you take one convention, stick to it, and apply it consistently, then that's OK.
The reason for doing this is that some statistical methods, particularly techniques which involve a log transformation, are unable to cope with zero values, so this allows us to get around that issue, although it does have some complications when we're presenting the data.
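Here is a small sketch of the two simple substitution conventions mentioned, half the limit of detection and the limit divided by the square root of 2, followed by the log transformation that a zero would break. The function name, the use of `None` for a blank cell, and the values are all assumptions made for the example.

```python
import math


def substitute_below_lod(values, lod, rule="half"):
    """Replace blanks (None) and values below the detection limit by a fixed value.

    rule="half"  uses lod / 2
    rule="sqrt2" uses lod / sqrt(2)
    """
    if rule == "half":
        fill = lod / 2
    elif rule == "sqrt2":
        fill = lod / math.sqrt(2)
    else:
        raise ValueError("rule must be 'half' or 'sqrt2'")
    return [fill if v is None or v < lod else v for v in values]


# Hypothetical element values (wt%); None marks a blank, below-detection cell
raw = [0.05, None, 0.12, None]
filled = substitute_below_lod(raw, lod=0.01, rule="half")
logged = [math.log10(v) for v in filled]   # works: no zeros remain

print(filled)
print([round(v, 2) for v in logged])
```

Whichever rule is chosen, the table still needs to say which values were substituted and what the limit was, so that nobody mistakes the fill value for a measurement.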
With our data, however, we're going to want to move away from just characterizing individual analyses or comparing the reliability of our samples, and begin to understand how to deal with groups of data.
When we're looking at archaeological results, it's pretty unlikely that we are actually dealing with a single tradition of production, a single source of raw materials, for example, or a single type of material. We might be looking at different colored glasses. We might be looking at metals from different ore sources within a region. We might be looking at different alloying practices, and so forth. In order to explore those we need to move beyond individual analyses and think about grouping analyses together and understanding groups in a variety of ways. We can do that archaeologically: based on archaeological knowledge, we can group by a particular class of object. We can ask, for example, what is the range of variability in tin for spearheads in this collection of data?
And we could give our results in a number of ways. We could look at just presenting the mean and standard deviation. We could provide the data and the means and standard deviations of those groups, which is probably a little bit better since it's allowing other users to actually use that data better. Indeed, I would say it's always necessary to publish the raw data that we produce so that it can be used in different ways by other analysts.
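Grouping by object class and summarizing each group might look something like this. The object classes and tin values are hypothetical and only stand in for a real data table.

```python
import statistics
from collections import defaultdict

# Hypothetical analyses: (object class, tin wt%)
analyses = [
    ("spearhead", 9.8), ("spearhead", 10.4), ("spearhead", 11.1),
    ("axe", 6.2), ("axe", 7.9), ("axe", 5.5), ("axe", 8.0),
]

# Collect the tin values for each class of object
groups = defaultdict(list)
for object_class, tin in analyses:
    groups[object_class].append(tin)

# Summarize each group: count, mean, standard deviation, minimum, maximum
summary = {
    object_class: {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }
    for object_class, values in groups.items()
}

for object_class, s in summary.items():
    print(f"{object_class}: n={s['n']}, mean={s['mean']:.2f}, sd={s['sd']:.2f}, "
          f"range {s['min']} to {s['max']}")
```

The raw `analyses` list is kept alongside the summary, in the spirit of publishing the underlying data as well as the group statistics.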
But actually the numbers themselves can be quite confusing, and certainly within archaeology what we're faced with is researchers from a range of different backgrounds, for many of whom that scientific, data-oriented approach may not be the most familiar. So numbers in this sense can be a little bit confusing and can be better expressed in visual terms.
So we need to think about how we present our data graphically as well as numerically. There's a whole range of ways that we can do this, and this can really help us to understand the patterns present in our data.
So one of the most common is to use histograms, which are a way of displaying numerical data in particular bands. I'm sure you have come across these before. You can see here this is the plot of the data we just looked at earlier on, showing us that we have this slightly positively skewed distribution of data for tin in those analyses of that standard reference material. So you can see that it's showing us the same thing as our box-and-whisker plot that we saw earlier.
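The banding behind a histogram is just counting values into equal-width bins. The sketch below does that and prints a crude text version; the tin values and the 2 wt% bin width are hypothetical, and the function assumes non-negative values.

```python
def bin_counts(values, bin_width):
    """Count non-negative values into equal-width bins starting at zero."""
    n_bins = int(max(values) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int(v // bin_width)] += 1
    return counts


# Hypothetical tin contents (wt%) for a set of copper-alloy objects
tin = [0.3, 0.5, 0.9, 1.4, 2.6, 5.1, 6.8, 12.2]
width = 2.0
counts = bin_counts(tin, width)

# Print one row per band, with one '#' per object in that band
for i, count in enumerate(counts):
    low, high = i * width, (i + 1) * width
    print(f"{low:4.0f} to {high:<4.0f} {'#' * count}")
```

The choice of bin width changes how the pattern looks, so it is worth trying more than one before reading too much into a small second peak.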
So if we look for an archaeological example, here you can see a comparison of two different areas, Mesopotamia proper and Luristan, where you have very different patterns of tin use, or where the presence of tin in those assemblages is very different.
What you have in Mesopotamia is that the largest number of objects of the analyzed set are showing very low tin levels. So you're looking at a falloff here from less than 1% down to six, seven percent, with a small second peak here, perhaps around 12%, and then a longer tail.
In Luristan, however, you've got a very different pattern. You have a normal distribution, pretty much, around 8 to 10% tin, and most of the objects are showing that they have tin present in them in some way, with relatively few objects coming out as being less than 1% tin.
That gives us an opportunity to interpret this, and we can interpret this in a variety of ways, but you could look at this as saying this is an area in which tin is being consistently added as part of the alloying process. This is an area where access to tin is relatively stable and the inclusion of tin in objects is quite consistent. It's being added at roughly around 6 to 10%, with some variance around that which you would expect from a combination of different effects, including analytical errors.
In Mesopotamia, however, that does not seem to be the case, and the majority appears to be very low tin. Now, the fact that there is some tin in quite a lot of the objects here actually suggests that perhaps you're looking at multiple different kinds of patterns. You might be looking at trade and exchange in tin-bearing objects, if you like, and you might be looking at local production using copper from other sources where tin is not common, and perhaps some degree of mixing between them: secondary reuse of material and combinations with other materials. So you're looking at quite complicated processes that you can begin to unpick between regions at different times.
So if we wanted to compare different types of data, for example the differences in composition of silver between coins found in two different hoards, and different types of silver objects and coins within those hoards, we can do that by means of, for example, a dot plot, and this is a nice visual way of displaying the range of data that we're getting.
You can see here that what you're looking at is XRF data, in this case from two different hoards, the West Bagborough hoard and the Hoxne hoard, two late Roman coin hoards, and ICP-MS results from the West Bagborough hoard as well. So it's a combination of comparisons here: we're looking at instrumental differences, typological differences, and inter-site differences. So it's a complicated picture, but you can see, hopefully, that what you can begin to pull out are differences in the ranges of material.
Reading horizontally across the plot, you can see that there's a good deal of consistency across the whole range: the results from XRF and ICP for the irregular coins seem to be pretty consistent, and the range of values for irregular coins at West Bagborough, which are thought to be forgeries (in many cases maybe, but maybe unofficial minting rather than what we might traditionally think of as forgery), is actually the same as, if not higher than, regular coins from the hoard at Hoxne. An interesting discovery, so we can begin to make these sorts of comparisons.
But what do we do if we want to plot more variables together to compare the correlations between them? It becomes increasingly difficult. Adding a third dimension is not terribly difficult. Of course, we can just plot in three dimensions, and there are plenty of graphing tools that allow you to do that, although already you're having to step beyond standard data processing tools which are widely used, like Excel and other spreadsheet tools; you're having to step into statistical packages to be able to do this.
So here in three dimensions you can see that by taking our data of tin and zinc, which already shows quite a good separation, showing that high-zinc material tends to be low in tin, we can add an extra dimension to show that high-zinc material is also very low in lead. A three-way comparison of information.
But once we get beyond that, we can't really plot in four dimensions, or five or six or seven, and we have to begin to find mathematical ways of condensing that variation.
The alternative, of course, is to produce matrix plots of all of our different variables, and this can be a useful tool in beginning the process of comparison. It can be a useful way of spotting clear correlations between things, but it becomes very unwieldy very quickly, and it becomes very difficult to do that sort of three-or-more-way comparison of different variables on individual points.
So what's the alternative? Well, multivariate statistics. Multivariate statistics are designed to compress dimensions. They are tools for the reduction of multiple dimensions into.
So another approach would be to assume that there is structure in our data, and to define that structure using a cluster analysis.
Although there are a variety of different variants of cluster analysis, the one commonly used in archaeology is hierarchical clustering, a method which effectively characterizes, pairwise, the similarity between each pair of data objects. That's going to define which two samples contain variables which are the most similar to each other. It will then create that as a cluster and then iterate that process, comparing all of the objects to each other and defining the next most similar pair of data objects. Ultimately it will start to link samples, and pairs of samples which have already been clustered together, into larger clusters.
The result is a hierarchical map of that process, which is called a dendrogram, and we can read that from the bottom upwards to see when individual clusters are formed and which are the two most similar to each other, or we can read it from the top down to see broad groupings which are similar to each other but different from the other groups, if that makes sense.
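The iteration just described can be sketched in plain Python. This is a toy version under assumptions the lecture does not fix: similarity is measured by Euclidean distance on two variables, clusters are compared by average linkage, and the five samples are hypothetical. In practice the variables would often be standardized or transformed first, since the variable with the largest numbers otherwise dominates the distance.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Straight-line distance between two compositions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, data):
    """Mean distance between every member of one cluster and every member of another."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(data):
    """Repeatedly merge the two most similar clusters; return the merge history."""
    clusters = [(name,) for name in data]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], data))
        merges.append((a, b, average_linkage(a, b, data)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical samples described by two element concentrations (wt%)
samples = {
    "A": (10.1, 0.2),
    "B": (10.3, 0.3),
    "C": (2.0, 8.1),
    "D": (2.2, 7.9),
    "E": (25.0, 15.0),   # deliberately unlike everything else
}

for a, b, distance in agglomerate(samples):
    print(f"merge {a} + {b} at distance {distance:.2f}")
```

The printed merge history is what a dendrogram draws: the earliest, shortest-distance merges sit at the bottom, and the odd sample joins last at the greatest distance.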
So in this case, let's look at this as an example. We're looking at the same data as we just saw. You can see that one of our samples is clearly different from everything else. It's plotting on its own, separately; it's the last thing to be classified. It is 1328, which, if we think back to our principal component analysis, was sitting way off in the far top left corner of our diagram there too, so it's identified as an outlier by that method as well. So no surprises there. Here we've got our kiln jug, and we've got a series of other samples which are linking together as the local group, archaeologically speaking, making broad sense.
If we look further to the right, we have a broad group which is clustering together to some degree. That shouldn't make us very surprised, since there as well we were seeing a broad and quite diffuse group which was encompassing both the imported material identified from the sites and known material from other sites. So that's basically showing us exactly the same information. It's identifying that here you have group four, sample 89 here, which is the Cypriot material, clustering slightly away from the main broad group here, which is all very closely similar to each other.
So that's basically how we would approach this, and we can break this down further, but we can only really begin to interpret this by going back into the archaeological record.
Now, with a hierarchical clustering system, we're not defining in advance how many groups will be found. But there are methods of cluster analysis which will enable us to do that, like k-means clustering. But that's a talk for another day, I suspect.
Another approach, which also assumes that there are categories within our data, is to apply a discriminant analysis. Discriminant analysis can be used with two or more groups, and is based on a similar principle, in some respects, to principal component analysis. But whereas principal component analysis was attempting to find vectors with the maximum variance across the whole data set through those different variables, discriminant analysis will attempt to find axes which are maximizing the difference between the centroids (the central point, if you like, the average value of our groups, you could conceive it as) and also minimizing the spread of those different groups. This will, at least theoretically, enhance the separation of our groups on our two-dimensional, dimension-reduced plot.
And so it'll look something a little bit like this, where you have your group centroids here separating out quite nicely from each other, so that you're getting clear divisions between them. Now in this case there's still some overlap in the middle, but when compared with the principal component analysis of the same data, it may be that you're getting better separation, so this is just another way of doing that.
It can then be used so that, if you have a set of data based on known characteristics, then new, unknown data can be applied to that and assigned to groups probabilistically. That means it can be used in situations where you're doing machine learning to identify groups of material. So it has quite a lot of power behind it, but it's a complicated process, and one which requires a little bit more investment, since it requires clearly defined groups to begin with in order to work well.
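For the simplest case, two known groups and two variables, the idea can be sketched with Fisher's linear discriminant: find the axis that pulls the group centroids apart relative to the spread within the groups. This is a cut-down illustration, not the full procedure a statistics package would run. The groups are hypothetical, and the new sample is assigned by a hard midpoint rule rather than probabilistically.

```python
def mean_vector(rows):
    """Centroid of a group of two-variable samples."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]


def scatter_matrix(rows, centre):
    """2x2 within-group scatter: summed outer products of deviations from the centroid."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - centre[0], r[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_axis(group_a, group_b):
    """Discriminant axis w = Sw^-1 (mean_a - mean_b), plus the two centroids."""
    m_a, m_b = mean_vector(group_a), mean_vector(group_b)
    s_a, s_b = scatter_matrix(group_a, m_a), scatter_matrix(group_b, m_b)
    sw = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return w, m_a, m_b


def project(w, x):
    """Score of a sample along the discriminant axis."""
    return w[0] * x[0] + w[1] * x[1]


def assign(w, m_a, m_b, x):
    """Assign a new sample to the group whose side of the midpoint it falls on."""
    midpoint = (project(w, m_a) + project(w, m_b)) / 2
    return "A" if project(w, x) > midpoint else "B"


# Hypothetical known groups: (tin wt%, zinc wt%)
group_a = [(9.0, 0.5), (10.0, 0.8), (11.0, 0.4), (10.5, 1.0)]
group_b = [(1.0, 8.0), (2.0, 9.5), (1.5, 7.0), (0.5, 9.0)]

w, m_a, m_b = fisher_axis(group_a, group_b)
print("new sample (9.5, 1.2) ->", assign(w, m_a, m_b, (9.5, 1.2)))
print("new sample (1.2, 8.0) ->", assign(w, m_a, m_b, (1.2, 8.0)))
```

The sketch also shows why clearly defined starting groups matter: the axis is built entirely from the centroids and spreads of the groups you supply, so poorly defined groups give a poorly placed axis.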
You can look into some of the details of these techniques in a variety of different publications. There are some very helpful resources online at the moment, I would say, and of course once the libraries become accessible, you'll be able to look some of these up in books. I would recommend, if you're going to continue to do any form of analysis, investing in some basic textbooks to help you to understand these techniques in more detail.
But ultimately, the best way to use them, and to learn their limitations, is to do so in practice. I've put up a few different examples of different approaches to the processing of data online for you to work through.