Comparing different machine‐learning techniques to date Nile Delta sediments based on portable X‐ray fluorescence data

Geomorphology generally aims to describe and investigate the processes that lead to the formation of landscapes, while geochronology is needed to determine their timing and duration. Due to restrictions on exporting geological samples from Egypt, modern geoscientific studies in the Nile Delta cannot date the investigated sediments and geological features by standard techniques such as OSL or AMS 14C; therefore, this study aims to validate a new approach using machine‐learning algorithms on portable X‐ray fluorescence (pXRF) data. Archaeologically dated sediments from the excavations of Buto (Tell el‐Fara'in; on‐site), geochemically characterized by pXRF analyses, serve as training data for running and comparing Neural Nets, Random Forests, and single‐decision trees. The established pXRF fingerprints are transferred via machine‐learning algorithms to set up a chronology for undated sediments from sediment cores (i.e., the test data) of the nearby surroundings (off‐site). Neural Nets and Random Forests perform well in dating sediments and deliver better classification results than single‐decision trees, which struggle with outliers and tend to overfit the training data. Furthermore, Random Forests can be modeled faster and are easier to understand than the complex, less transparent Neural Nets. Therefore, Random Forests provide the best algorithm for studies like this. In addition, river features east of Kom el‐Gir are dated to pre‐Ptolemaic times (before 332 B.C.) when Kom el‐Gir had possibly not yet been settled. The research in this paper shows the success of close interactions between various scientific disciplines (Geoinformatics, Physical Geography, Archaeology, Ancient History) to decipher landscape evolution in the long‐term‐settled Nile Delta's environs using machine learning.
With the approach's design and the possibility of integrating many other geographical/sedimentological methods, this study demonstrates the potential of the methodological approach to be applied in other geoscientific fields.


| INTRODUCTION
Most geoscientific studies do not focus solely on the processes that shape sedimentary archives; they also need age determinations to establish when those processes began and how fast they proceeded.
In highly dynamic environments like the Nile Delta, dating is essential. Unfortunately, no laboratories for the most common approaches, such as 14C-AMS and OSL, exist in Egypt, and sample export is forbidden.
The Nile Delta has been a place to live and an economic hotspot in the Eastern Mediterranean world throughout the millennia. Most parts of Egypt are life-threatening deserts, and only a narrow strip along the Nile River, the Faiyum Oasis, and the fertile Nile Delta are permanently habitable. The first archaeological remains proving human occupation in the Nile Delta date back 6500 years B.P. As deltas are highly mobile landscape features, shaped mainly by water and sediment supply from the river's drainage area, post-depositional sediment compaction, sea currents, tides, sea-level fluctuations, and finally, human impact, they have undergone significant modifications over time.
While obtaining geochronological data is problematic, archaeological data are abundant in Egypt. Archaeological features can be dated to a specific era (Ptolemaic, Roman Imperial, etc.) by their construction style (buildings, roads, harbor infrastructure, etc.), their material culture (ceramics, consumables, weapons, etc.), or by interpretation of depicted symbols (coins, heraldry, etc.) and therefore carry information about the depositional context. Typological dating is based on professional archaeological expertise and can be performed directly in the field (Auriemma & Solinas, 2009; Gowlett, 2006; Martini & Sibilia, 2001; Seeliger et al., 2014).
Furthermore, in the absence of laboratory analyses, sedimentary environments can be characterized using portable X-ray fluorescence spectroscopy (pXRF) to investigate elemental compositions in different settings (Altmeyer et al., 2021; Ginau et al., 2020; Lubos et al., 2016; Pint et al., 2015). Combining the geochemical characteristics of archaeologically dated and, so far, undated sediment deposits by machine-learning (ML) algorithms allows for setting up a chronology for these environments. Based on this assumption, Ginau et al. (2020) conducted a pilot study and showed the functionality and validity of this approach for the study area around Buto (Tell el-Fara'in; Figure 1). They also tested the accuracy of this new approach by comparing the obtained ages with dated ceramics and radiocarbon ages sampled before the export ban came into force. So far, unfortunately, no other studies have used a similar approach based on such a substantial geochemical data set in a comparably large study area or in another deltaic landscape (Ginau et al., 2020).
Therefore, we aimed to (1) apply three different ML approaches, Neural Net (NNet), Random Forest (RF), and single-decision tree (C5.0), to check whether archaeological age information (on-site) can be transferred to sediments far off the settlement mounds (off-site) using pXRF data, (2) compare all approaches and evaluate whether the easily comprehensible C5.0 and RF show results similar to those of the "black-box system" NNet (Lantz, 2019), and finally, (3) furnish a chronostratigraphic framework for the sediment cores described by Altmeyer et al. (2021) for the channels adjacent to Kom el-Gir.

| STUDY AREA
The study area is situated in the northwestern part of the Nile Delta, south of Lake Burullus, east of the Rosette River Branch, and northwest of the most significant modern city in that part of the Delta, Kafr El Sheikh (Figure 1).

| General setting
In general, the landscape of the Nile Delta is very flat, with low elevation differences, and evolved over the millennia through an interplay of fluvial and marine processes. Only a surplus of sediment delivered by the Nile led to its progradation. Tides as well as strong longshore currents erode freshly delivered sediment and therefore suppress delta progradation. The same holds true for fluctuating sea levels, which favored or hindered the delta's seaward growth. Thus, the delta evolution that resulted in the modern appearance of the Nile Delta started when sea-level rise slowed (6 kyrs B.P.) (Elfadaly et al., 2019; Kelletat, 2013; Lambeck & Purcell, 2005). [...] Branch that flows through the eastern part into the Mediterranean Sea (Ginau et al., 2020; Pennington et al., 2017). Having a drainage area of approx. 3.3 million km² and a total length of roughly 7000 km, the Nile is the world's longest river (Fielding et al., 2017; Garzanti et al., 2015; Liu et al., 2009). Due to the large drainage area, stretching over more than 30° in latitude and sourced by the White Nile from Uganda and the Blue Nile and Atbara River from Ethiopia, the Nile River is governed by precipitation regimes in several regions.

| Hydrology and geology
What holds true for the precipitation is also valid for the basement rocks eroded in the different catchments. The Atbara River and the Blue Nile drain Cenozoic basalts of the Ethiopian Highlands, whereas the White Nile erodes Archaean-Proterozoic rocks of the old Congo Craton, extending through the Precambrian rocks of the Saharan Metacraton (Box et al., 2011; Krom et al., 2002; Ménot et al., 2020; Revel et al., 2010; Williams, 2009; Woodward et al., 2015). The annual Nile flood between July and October is responsible for the fertility of the Nile valley and Nile Delta. This millennia-old process was stopped by the construction of the Aswan High Dam in the 1960s. Since then, the amount of sediment reaching the Delta and the Mediterranean Sea has dropped to zero, causing numerous severe problems for the Nile Delta population. Furthermore, due to global warming, the rising sea level degrades the delta front, and salty marine water intrudes into the groundwater body, preventing the agricultural use of this now brackish water. Natural sediment compaction and oil and gas extraction in the Nile Delta region have led to high subsidence rates, making the coastal strips more prone to degradation by the sea (Marriner et al., 2012; Stanley & Warne, 1998; Syvitski et al., 2009). Since the time of the Eo-Nile in Tertiary times, sediments transported by the Nile have accumulated in thick packages within the area of the modern Nile Delta. In this study, only the uppermost layer of this sedimentary body, deposited during Holocene times, is investigated (Wunderlich, 1989).

| Settlement history
The annual Nile flood was both a boon and a bane for people living in the valley and the Nile Delta. In ancient times, the yearly flood delivered sediments and fertilized the cropland but also posed the risk of flooding settlements. To avoid this risk, natural levees along the Nile branches and so-called geziras, sand mounds rising just a few meters above the surrounding cropland, were the preferred settlement areas in the Nile Delta. The ancient settlements grew over the millennia and today form so-called "koms" and "tells" overtopping the cultivated area. In Antiquity, their elevated position offered flood protection compared to the flat fertile land (Butzer, 2002; Schiestl, 2021). Therefore, modern researchers focusing on landscape reconstruction and ancient settlement patterns often design their study along river branches (Figure 2a) [...] connection to waterways (Altmeyer et al., 2021; Ginau et al., 2017, 2019; Schiestl, 2021) [...]
Additionally, the ERT results reveal sediments identified as inactive channels and associated channel elements. Finally, the correlation of the entire data set, further literature, and topographic maps enabled us to reconstruct an ancient channel system near the kom (Figure 3a). To sum up, the results reveal clear evidence of a former channel system within the study area but, so far, of unknown age (Altmeyer et al., 2021). Therefore, data of the pXRF measurements around Kom el-Gir are also used in this paper to add age estimations to the cores' stratigraphy.
Ginau et al. (2020) present the first results of establishing an NNet based on pXRF data from Buto to transfer archaeological age estimations from the tell to the surroundings. Furthermore, they prove this approach's validity and statistical correctness. Therefore, this paper focuses on applying C5.0s and RFs to the same data set and comparing their results with the already existing Neural Net.

| ML-A short overview
The overall principle of ML is based on algorithms that automatically gain accuracy through experience. It arose from the need to let computers perform tasks without explicitly programming a dedicated tool. Therefore, several approaches were grouped under the term ML to teach the computer to fulfill operations for which no entirely satisfactory program or algorithm existed before or that are too complicated or time-consuming to be programmed by humans (Sheppard, 2019; Taylor, 2017). Every ML approach can be assigned to one of the popular "learning styles": supervised, unsupervised, and semi-supervised. RF and C5.0 belong to the supervised learning styles. RFs are an enhanced version of decision trees using bagging (Hänsch & Hellwich, 2017) and are listed under the ensemble methods (Figure 4). In contrast, NNets are more complicated, include more statistics, and do not use explicitly transparent procedures. As NNets are based on training data that create the model to predict results for the test data, they are handled as a semi-supervised learning style (Smith, 2017; Vemuri, 2020; Wu et al., 2008).
In summary, supervised learning approaches present the algorithm with a training data set (here, pXRF data, 28 elements per sample) and the desired results (here, one of the historical periods Roman, Dynastic, or Predynastic, or the non-historical group Nile). The aim is to discover a rule that allows the computer to reproduce the desired results and thereby establish a model. In a further step, this model predicts the results (the historical period) for undated test-data samples where only the geochemical pXRF fingerprint is known. NNets are often promoted as the algorithm of choice for most questions but are very computationally intensive. In contrast, RFs and C5.0s are easily understood and can be quickly calculated even on regular desktop computers (Figure 4).
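The train-then-predict workflow described above can be illustrated with a deliberately simple stand-in classifier. The sketch below uses a nearest-centroid rule, which is not one of the three algorithms compared in this paper; it only shows how labeled fingerprints yield a model that then assigns a class to an unlabeled sample. The function names, the three unnamed features, and all numeric values are hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Average the feature vectors of each class (a nearest-centroid 'model')."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the class whose centroid lies closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical, made-up fingerprints (three unnamed features instead of 28 elements).
train_x = [[1.0, 0.1, 0.2], [0.9, 0.2, 0.1], [0.2, 1.0, 0.9], [0.1, 0.9, 1.0]]
train_y = ["Nile", "Nile", "Roman", "Roman"]

model = fit_centroids(train_x, train_y)          # training step (dated samples)
print(predict(model, [0.95, 0.15, 0.15]))        # prediction step (undated sample)
```

Any of the three algorithms discussed here would take the place of `fit_centroids` and `predict`; the data flow from dated training samples to undated test samples stays the same.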

| Single-decision trees (C5.0)
C5.0 is an extension of the C4.5 algorithm, which has a long tradition in classifying large data sets. As its main improvement, C5.0 is more efficient and uses fewer computer resources than C4.5. The C5.0 model splits the data based on the attribute that provides the maximum information gain (Mingers, 1989), creating binary trees (just two sub-data sets per node). Tree pruning is performed using the Binomial Confidence Limit. At first, a large tree is built that fits the training data very closely. Later, it is pruned by deleting branches that are found to have a high error rate. Finally, the tree's performance as a whole is checked (Barros et al., 2012; Esposito et al., 1997; Sharma et al., 2013).
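The splitting criterion mentioned above can be sketched for a single numeric attribute. The example computes the plain entropy-based information gain of a binary threshold split; it is a minimal illustration, not the C5.0 implementation, which may normalise or otherwise refine this quantity. The function names and the toy values are assumptions made for this sketch.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction achieved by splitting at value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Observed value with the highest gain (assumes at least two distinct values)."""
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: information_gain(values, labels, t))


# Hypothetical concentrations of one element and the classes of four samples.
values = [0.1, 0.2, 0.8, 0.9]
labels = ["Nile", "Nile", "Roman", "Roman"]
print(best_threshold(values, labels), information_gain(values, labels, 0.2))
```

In this toy case the split at 0.2 separates the two classes completely, so the gain equals the full entropy of the unsplit data (1 bit).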

| RF
The general concept of RF (or decision tree forest) is inspired by nature. In a natural forest, a wide range of trees is found, from juvenile to very old, tall and small, and robust and weak ones. The mixture of many different trees makes the forest strong. Computer science has adopted this idea. An RF is an ensemble of decision trees built on random subsets of the original training data, and the decision of the entire RF is based on the results of each tree. Therefore, RFs are a typical example of bagging (Hänsch & Hellwich, 2017). The two most important advantages of RFs are (i) their fast computation, as each tree uses just a subset of the data, and (ii) their way of calculating manifold random trees so that individual, less probable decisions, very likely caused by outliers, do not significantly affect the overall result (Breiman, 2001; Hengl et al., 2018; James et al., 2013; Lovelace et al., 2019; Mingers, 1989; Zhu, 2020).
Here, we use the "randomForest" package in R. Several tests are required to find the hyperparameter configuration with the lowest out-of-bag (OOB) error. The OOB error or Misclassification Rate (stated in %) describes how well a bagged decision-tree model predicts. While growing each tree, a defined share of the training data (here 2/3) is used. The OOB test takes the unused 1/3 of the data and checks whether the established trees predict these data correctly. The resulting OOB error (the lower, the better) is a valid estimate of the test error for a bagged model (Genuer & Poggi, 2020; Lovelace et al., 2019; Sheppard, 2019; Smith, 2017). However, tuning an RF to arrive at an ideally low OOB error remains very time-consuming work (Lantz, 2019).
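The in-bag/out-of-bag division can be sketched as follows. When n samples are drawn with replacement, the expected share of distinct samples in the bag is 1 - 1/e (about 63%), which is close to the 2/3 stated above; the remaining samples are out of bag. This is a simplified illustration for a single tree, not the internals of the "randomForest" package, and the function names and the seed are assumptions.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; unseen indices are out of bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


def misclassification_rate(true_labels, predicted_labels):
    """Share of wrong predictions in percent (the lower, the better)."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * wrong / len(true_labels)


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
in_bag, out_of_bag = bootstrap_split(1000, rng)
print(len(set(in_bag)) / 1000, len(out_of_bag) / 1000)
```

For a full forest, each sample would be predicted only by those trees for which it was out of bag, and `misclassification_rate` would then be applied to these aggregated predictions.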
The two most important hyperparameters of an RF are the number of grown trees (ntree) and the number of attributes considered at each split (mtry). For both, the OOB error decreases with a rising value. Plotting the OOB error against ntree helps detect the optimal number of grown trees (Figure 5). Following this, the OOB error reaches its minimum at around 30 trees. In addition, we calculated different combinations of ntree and mtry to detect the optimal number of attributes (Table 1). As a combination of [...]
Furthermore, we tuned the RF utilizing the MeanDecreaseGini (MDG) and the MeanDecreaseAccuracy (MDA). We aimed to detect the most influential attributes while establishing the RF and reduced the data set to enable an even faster calculation. For the MDA, the 10 most influential attributes (elements S, P, Ca, K, Pb, Sr, Si, Fe, Cl, and Ba) were selected, while for the MDG, the 9 most characteristic attributes (elements P, K, Sr, Mn, Ca, Fe, Pb, Zn, and S) were chosen using the "importance" function of the "randomForest" package in R (Figure 6).
Especially for the MDG, it is interesting to note that there is good accordance with Factor 5 of the factor analysis presented by Ginau et al. (2020), which represents the human influence. Pruning of the trees in the RFs was not performed, as maxnodes, which defines the maximum number of terminal nodes (leaves) per tree in R, was left at its default. Furthermore, sampling with replacement is used so that each new tree can draw its subsample from the entire data set. Besides, the voting mechanism for the final decision was also left at the R-package default.
Each tree of the RF casts one unambiguous vote, that is, 100% for either Roman, Dynastic, Predynastic, or Nile. In the final step, the historical period with the highest overall number of votes from the RF trees forms the decision of the entire RF (Berk, 2020).
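The voting step described above amounts to a simple majority count. The following minimal sketch assumes that each tree's vote is available as a class label; the function name and the five example votes are hypothetical. Ties are resolved here by first occurrence, which need not match the behaviour of the R package.

```python
from collections import Counter


def forest_decision(tree_votes):
    """Return the most frequent class and its share of all tree votes."""
    counts = Counter(tree_votes)
    period, n_votes = counts.most_common(1)[0]
    return period, n_votes / len(tree_votes)


# Hypothetical votes of five trees for one sediment sample.
votes = ["Roman", "Roman", "Nile", "Roman", "Dynastic"]
print(forest_decision(votes))
```

The vote share returned alongside the winning class gives a rough indication of how unanimous the forest is for a given sample.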
Based on this, a model for the MDA, including those 10 elements and the MDG with the mentioned 9 elements, was created using ntree = 1024 and mtry = 3. Furthermore, RFs using mtry = 8 while

| Neural Net
The model of NNet consists of neurons that can receive information from external input or other neurons and forward it in modified form to other neurons or output it as a result. Input neurons receive data or information as signals from outside the network. Hidden neurons lie between the input and output neurons and are responsible for the calculations. Output neurons finally hand the information out of the network again (Figure 7). The neurons are interconnected by edges. This allows one neuron's output to become the next neuron's input. To avoid chaos, each neuron is connected with all neurons of the next layer but not with the neurons of its own layer (Lantz, 2019; Oonk & Spijker, 2015; Rashid, 2017; Taylor, 2017). Depending on the strength and importance of the connection, the edge has a flexible weighting. The stronger the weighting, the greater the influence a neuron can exert on another neuron via the link. Positive and negative weights exist, representing excitatory or inhibitory influence. If the weighting is zero, a neuron does not affect the next neuron via the edge. The knowledge and thus the artificial intelligence of a neural network are stored in the edges and their weights. The number of neurons and neuron layers and the connection possibilities of the neurons of different layers determine the neural network's complexity (depth) and its ability to solve problems. During neural network training, the connections change weights depending on the applied learning rules and achieved results (Lantz, 2019; Oonk & Spijker, 2015; Rashid, 2017; Reimann et al., 2008; Taylor, 2017).
FIGURE 7 The Neural Net configuration used in this paper is similar to Ginau et al. (2020) but, in contrast, with a freshly trained set of imputed training data. The input layer contains the training data's 28 elements, followed by two hidden layers with 16 and 5 neurons. Finally, the three historical periods and Nile act as the output layer (own compilation based on Ginau et al., 2020).
Here, we used a neural net based on the R-package "neuralnet." The input layer consisted of the values from the 28 attributes of the training data set. Two hidden layers of 16 and 5 neurons were added, while the output layer includes the three periods Predynastic, Dynastic, Roman and the non-historical group Nile.
Unfortunately, no one-size-fits-all rule exists to determine the number of hidden layers and their number of neurons. The number depends on the number of input nodes, the size of the training data, the amount of poor or noisy data, the complexity of the task that the neural net should perform, and the available computer performance. It makes sense to reduce the number of neurons step by step with each new hidden layer added.
Finally, the best balance between accuracy and the time needed to train the network has to be found. Furthermore, the "black-box" character of the neural net (Lantz, 2019) makes it difficult to understand what exactly happens at each neuron and what determines the final weight of each edge, which is the most significant disadvantage of this algorithm (Ginau et al., 2020; Oonk & Spijker, 2015; Rashid, 2017).
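To make the layer structure concrete, the sketch below passes one sample through a fully connected network with the layer sizes given above (28 inputs, hidden layers of 16 and 5 neurons, 4 outputs). It is not the trained model of this study: the weights are random and untrained, the logistic activation and the order of the output classes are assumptions, and all names are hypothetical.

```python
import math
import random

LAYER_SIZES = [28, 16, 5, 4]  # input, two hidden layers, output (as in the text)
PERIODS = ["Predynastic", "Dynastic", "Roman", "Nile"]  # assumed output order


def init_weights(layer_sizes, rng):
    """One matrix per pair of adjacent layers; each row is [bias, w1, w2, ...]."""
    return [
        [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


def logistic(x):
    """Logistic (sigmoid) activation, assumed here for every neuron."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(weights, inputs):
    """Propagate the inputs layer by layer; edges only link adjacent layers."""
    activations = inputs
    for layer in weights:
        activations = [
            logistic(row[0] + sum(w * a for w, a in zip(row[1:], activations)))
            for row in layer
        ]
    return activations


rng = random.Random(0)
weights = init_weights(LAYER_SIZES, rng)      # untrained, random edge weights
sample = [rng.random() for _ in range(28)]    # hypothetical scaled pXRF values
outputs = forward(weights, sample)
predicted = PERIODS[outputs.index(max(outputs))]
print(predicted, [round(o, 3) for o in outputs])
```

With these layer sizes the network has 573 adjustable weights including biases, which gives an impression of why the meaning of an individual edge weight is hard to trace.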

| RESULTS
The ML approaches described in Section 3 were applied to sediments of 41 corings, of which four are presented in detail here. First, we present corings G8 and G9 from the Buto area, where several diagnostic ceramics act as a chronological crosscheck. Furthermore, we show corings M005 and M006 situated north of Kom el-Gir, where detailed interpretations of sedimentary units established by Altmeyer et al. (2021) are available (Figures 2 and 3).

| Corings G8 and G9 in Buto
Corings G8 and G9 form a transect that runs at a right angle from the tell border in its northwestern part. G8 lies just 9 m from the tell border and 43 m from G9, which is 52 m from the tell (Figures 3 and 8).

| Corings M005 and M006 close to Kom el-Gir
Both corings M005 and M006 are located north of Kom el-Gir (Figures 3 and 9). As mentioned in Section 2.3.2, Kom el-Gir was only settled in Ptolemaic and Roman times. In our ML approaches, both periods are named Roman. Therefore, we only expect Nile and Roman votes for the corings from that region. In Antiquity, rubbish material of broken ceramics and bricks was frequently used to consolidate swampy ground (Pint et al., 2015; Seeliger et al., 2013, 2019). This might also have been the case here. [...] an even thinner Roman layer of just 2 m on top of a massive Nile package. As this coring location is far from the kom, this is not surprising. In remarkable concordance with M005, the observed riverine impact on the landscape occurred before the occupation in Roman times but probably lasted longer at M006.

| DISCUSSION
Based on the results, the three main aims of this paper are addressed in the following.

| Transfer of archaeological age information from the tell (on-site) to the surroundings (off-site)
In general, the research approach described in Sections 1 and 3, based on the training data obtained at Buto, delivered satisfactory results for dating sediment layers in the environs of Buto and Kom el-Gir. All presented corings show the general trend of the expected sequence of Nile-Predynastic-Dynastic-Roman (for Buto) and Nile-Roman (for Kom el-Gir). Overall, the results match the age estimations of the diagnostic ceramics for corings from Buto (Figure 8). Therefore, all approaches, even the C5.0s with reservations, can generally be used to yield ages for the so far undated sediment cores around the tells. However, as the training data were established using sediment samples from Buto, it was unclear whether they would also work for Kom el-Gir, as the archaeological layers of both tells differ. Setting up specific training data for different tells is worth testing but practically difficult due to limited time and resources in the short periods of fieldwork. However, looking at the Kom el-Gir corings (Figure 9), it is clear that the training data obtained at Buto also predict those corings' sediments well.
Nevertheless, three points of the general approach should be discussed.

| Elementary composition and temporal resolution
The main components of all samples, in both the training and the test data, show high concentrations of the Nile mud's typical elements like Al, Ca, and Si and remarkably lower amounts of cultural elements like Pb and Cu, which are used to determine the different historical periods (Figure 2b; Ginau et al., 2020). Due to the tiny amount of cultural elements compared with the high amount of natural Nile mud elements, it is questionable whether the Nile Delta's sediments would allow an even more satisfactory historical resolution. Splitting the Roman period used here into its sub-periods might be problematic, and the aim should be to yield a confidence interval similar to 14C (Geyh, 2005; Reimer et al., 2020).
Besides that, the amount of training data sampled for those sub-periods matters. The number of samples per historical period is not large enough to allow a more suitable subdivision. Therefore, they are grouped roughly into only three historical periods (Figure 2b). Running the models with less training data for each period would result in weaker models and more errors in the predictions (Ginau et al., 2020). Furthermore, relying on imputation alone is not a solution, as this technique may cause errors when applied too extensively to a too-small data set (Aitchison, 1982; Morita, 2021; Reimann et al., 2008; Schober & Vetter, 2020). Therefore, the only way to fix this issue is to sample more training data from archaeological sequences. Lastly, the significant settlement hiatus at Buto (about 1500 years; Figure 2b) is still problematic, as a considerable period is missing in the training data. Therefore, further tells nearby must be integrated into this study to fill the temporal gaps in the training data.

| Outreach of the training data
Our approach allows dating sediments at Kom el-Gir, a short distance from Buto, where the training data were obtained (Figures 2 and 3). Nevertheless, it remains unclear whether the training data set of Buto will also work at distant locations in the Nile Delta, as the sediment composition should differ considerably in other parts of the Nile Delta (Figure 1). Furthermore, the historical development is not homogeneous over the entire delta. Therefore, it is not to be expected that the training data of Buto can be applied to places far away.

| Riverine landscapes
Furthermore, it is questionable whether the approach works for riverine landscape features. River branches mix up the transported sediments and may deposit them randomly. They even erode material further upstream and deposit it at locations far from its origin. Therefore, the sedimentary filling of former river courses, for example, at M005 and M006, might only be usable after the energy has left the system and the river branch has turned into an oxbow or billabong. To sum up, the approach used here, comparable to 14C, shows problems in dating relocated sediments.

| Performance of the different ML approaches
NNet is a potent tool used in a wide range of applications. Nevertheless, it has some weaknesses, mainly its "black box character" (Lantz, 2019), speed, and required computing capacity. While RFs and C5.0s each took just a few minutes to be calculated on a regular office computer (Intel Core i7-3770 @ 3.40 GHz with 16 GB memory), NNet took 3 days for this process. It is also not visible to the user how exactly NNet works, such as how many backpropagation steps were performed while establishing the full NNet model. Those two disadvantages argue for RF and C5.0, where the entire working process is understandable. It is visible what happens at each node, and the general concept of purity or information gain that triggers division at the nodes is easy to understand. Therefore, the idea of RFs and decision trees provides a time-efficient and easy-to-handle approach, which is particularly useful for our study. Figure 10 compares the RF/C5.0 configurations with NNet (set to 100%). The higher the percentage, the closer the RF/C5.0 results are to the NNet results. C5.0s show agreements of only around 50% with NNet, while RFs yield around 75% (except the deliberately weakly configured RF10 with 70%).
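The percentage agreement with NNet described above can be sketched as the share of samples for which two algorithms predict the same class. The function name and the four example predictions are hypothetical; how the published percentages were actually computed (e.g., per coring or pooled over all samples) is not stated here and is left open.

```python
def agreement_percent(reference, candidate):
    """Percentage of samples for which both algorithms predict the same class."""
    if len(reference) != len(candidate):
        raise ValueError("both prediction lists must cover the same samples")
    matches = sum(r == c for r, c in zip(reference, candidate))
    return 100.0 * matches / len(reference)


# Hypothetical predictions for four samples of one coring.
nnet = ["Nile", "Nile", "Roman", "Dynastic"]
rf = ["Nile", "Roman", "Roman", "Dynastic"]
print(agreement_percent(nnet, rf))
```

Comparing NNet with itself yields 100%, which corresponds to the reference level used in Figure 10.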
Furthermore, some general statements on which algorithm works best can be made.

| Detecting the onset of human impact
There is a general trend in the majority of our corings that RFs and NNet show similar results. Both are especially good at separating the natural Nile material from the group of the cultural layers. This division is helpful when dealing with archaeological questions to determine the onset of human impact. Additionally, corings close to the tells tend to be more diverse in voting. This is not surprising, as the tells act as the primary source, and the immediate surroundings show the most human activity.

| RF configurations
Almost all RF configurations show more or less consistent results. RF10 (about 70% agreement with NNet) is not as strong as the other RFs (about 75% agreement with NNet). However, the difference between them is not as large as expected (just about 5%; Figure 10).
The reduction to the most informative 10 and 9 elements, respectively, also reduces the mtry, again making the calculation process faster (Supporting Information: Appendix 2). As this reduction in elements is not detrimental to the prediction accuracy, it represents an excellent way to speed up the calculations. RF MDG even reaches the highest agreement with NNet (Figure 10). Generally, all RFs are strong in predicting Roman, which makes sense as Roman delivers strong elementary signals (Ginau et al., 2020) and plays a decisive role in each tree's voting procedure inside the entire RF. With minor exceptions of RF10, all RFs in this paper yield good results compared with NNet, as the bagging process prevents RFs from overinterpreting outliers.
Especially in coring M006 (Figure 9b), the C5.0 predictions differ from those of RFs and NNet. As a tendency, both C5.0s predict Roman too often, mostly where other algorithms vote for Nile. This is also seen in the confusion matrices of both approaches (Supporting Information: Appendix 1). If the algorithm cannot separate Roman and Nile in the training data, it cannot be that strong when voting for them in the test data. Although this problem in the training data was reduced by boosting in C5.0 Bo3, the misclassification could not be eliminated. A further limitation of both C5.0s, in contrast to RFs, is overfitting and the weakness against outliers. In C5.0s, each training data line is used and represented in the resulting tree structure. A few outliers, perhaps measurement errors or contaminated samples, may result in weak leaves in the tree and the overinterpretation of these errors. Nevertheless, both C5.0s also, in some cases, deliver valuable and congruent votes compared with NNet and RFs for cores G8 and G9 (Figure 8).
Overall, this study shows that NNet, due to its complexity of connection possibilities, is not universally the best algorithm for predicting results (i.e., the historical period) for test data. All RF [...] (Figures 2b, 3a). This seems plausible as, historically, the regional settlement growth in the Graeco-Roman period has been largely explained by the benefits of a functioning network of waterways. Hopefully, further studies can clarify these results. Therefore, interpreting further corings in the way that Altmeyer et al. (2021) [...]
NNet and RFs deliver good results in transferring the elementary fingerprint of dated archaeological sediments of the tells to undated sediments of the nearby farmland. NNet and RF revealed the most robust classification results of the three algorithms used here. In contrast, the C5.0 algorithm showed substantial deficits, mainly caused by overfitting and the inclusion of outliers and poor data. As NNet, compared with RFs, needs more computer capacity and suffers from a lack of transparency during the calculation process, RF is the best choice among the three algorithms used here. RF has the advantage that it is less affected by outliers, is easily understandable, and can be quickly calculated even on regular office computers.
Therefore, the concept of RF provides a time-efficient and easy-to-handle approach, particularly useful for applied studies such as geoarchaeological ones.
Furthermore, a chronostratigraphy for the sediment cores of Altmeyer et al. (2021) was established, answering the question of when the described riverine features were active in the past. When dating more corings in the surroundings using our approach, both the timing and processes of the deltaic landscape formation can be unraveled. Nevertheless, some questions and challenges remain to be addressed in future research. Every study area ideally needs its own training data. As sampling the training data is the most time-consuming process in this approach, this is the critical point when applying it to other study areas worldwide. For example, the training data of Buto worked for the nearby Kom el-Gir, but these data cannot be transferred one-to-one to other parts of the Nile Delta or even other deltas worldwide.
Overall, this study shows that the chosen approach is valid for deltaic environments like the Nile Delta. However, no similar studies in limnic or aeolian depositional contexts have been conducted to date. Finally, future work needs to be done to refine the historical resolution of the training data so that more than three quite rough historical periods can be detected by the ML approaches.
To sum up, this study acts as a practical example of interdisciplinary work (Geoinformatics, Geography, Archaeology, History) looking to resolve upcoming questions about dating the Nile Delta's sediments and help decipher the evolution of this vital landscape settled for many millennia.

ACKNOWLEDGMENTS
The fieldwork was performed as part of the regional survey around Buto (Tell el-Fara'in), with a particular focus on Kom el-Gir. This project was undertaken under the umbrella of the Cairo Department of the German Archaeological Institute (DAI). Thanks are due to the entire DAI team, especially the former first director of the DAI Cairo Department, Stephan Seidlmayer, for his hospitality and support during all field campaigns. We thank the Egyptian Ministry of State for Antiquities for their assistance and support of our work in the field and for granting the research permits. Furthermore, we thank the two anonymous reviewers for their valuable comments. Finally, MS wishes to thank the team of the University of Applied Sciences Mainz, Germany, for many insights into geoinformatics and coding that he received while pursuing a part-time master's program between 2019 and 2021 that provided the impetus for this paper.