How do we avoid imposing the present on the past when modelling spatial interactions?

Theoretical archaeological modelling for describing spatial interactions often adopts contemporary socioeconomic ideas whose 20th-century language gets translated into historical behaviour with the simplest of lexicons. This can lead to the impression that the past is like the present. Our intention in this paper is that, when this happens, we strip out as much of the contemporary context as we can, to bring modelling back to basic epistemic propositions. We suggest that although the underlying ontology may be specific to contemporary society the epistemology has much greater generality, leading to essentially the same conclusions without the carapace of intricate economics.

The sentiment underlying this quote is arguably even more valid when it comes to proto-historic and prehistoric archaeology, for which we have even less data than for other periods. However, we shall show below that, rather than seeing this as a cause for concern, some archaeologists have proposed that the projection of contemporary society onto the past provides a new way to proceed. In this they directly repurpose economic models describing the attributes of today's society with a perceived historic counterpart (e.g., urbanisation, mobility, migration, trade) into historic contexts. A change of vocabulary is necessary, but the lexicon is often very transparent.
There is one immediate issue with this temporal and linguistic translation, even when it is appropriate. The 20th-century modelling that we mimic has its own narrative formulation which may, or may not, be realistic. In the worst case we are not only putting an ancient gloss on a contemporary narrative, but that narrative may itself be unreliable. The contemporary models that we shall discuss below have their origins firmly in free-market economics. To quote the free-market economist Milton Friedman (1953.14-15), interpreting his use of 'hypothesis' to be synonymous with our use of 'model': "A hypothesis is important if it "explains" much by little ... To be important, therefore, a hypothesis must be descriptively false in its assumptions ... Truly important and significant hypotheses will be found to have "assumptions" that are wildly inaccurate descriptive representations of reality, and, in general, the more significant the theory, the more unrealistic the assumptions." The result of these two narrative mismatches could be a (pre)historic Just-So Story of homo economicus¹, entertaining but little more.
Nonetheless, Friedman argued that what matters is whether the model 'works'. Rather than dismissing any narratives because they all encode detail that cannot be substantiated, we go for an approach as lacking in narrative detail as possible. It can be shown that in many contexts there is an epistemic-ontic duality which permits a model reformulation that brings modelling back to basic epistemic propositions rather than imputing specific 'economic' agent-driven activity. This enables us to reduce the need for agent narratives without losing our ability to answer the major questions.

Spatial interaction modelling
We are specifically interested here in how archaeologists borrow from the present in modelling the patterns of 'exchange' in past society, particularly between communities separated spatially. The natural framework for these questions is provided by spatial networks, a pattern of nodes connected by links. The nodes label the origins and targets of the exchange, in our case sites, whereas the links describe the interactions among them. Within the framework of such spatial interaction modelling contemporary economists, social geographers, transport analysts and urban planners have provided tools for modelling the key attributes of exchange. The potential relevance to archaeology is clear and has been realized successfully in many examples. In this paper we shall consider the use of two different 20th-century economic network models as templates for describing different social hierarchies in the same historical period (Middle Bronze Age) in the same place (central Anatolia):
i) Assyrian trade routes in the early Middle Bronze Age (C19 BCE) in Anatolia (Barjamovic et al. 2017), explored in the framework of 20th-century 'Ricardian' modelling of individual traders (Eaton, Kortum 2002; Anderson, van Wincoop 2003).
ii) Settlement structure in central Anatolia and the Khabur Triangle in the Middle Bronze Age (Davis et al. 2014; Palmisano, Altaweel 2015), explored in the framework of the 20th-century methodology of dynamic 'Retail Modelling' (Harris, Wilson 1978).
These models were chosen because they arguably provide the most prescriptive application of free-market 20th-century economic theory to the historic past, although in very different ways.
The former is the simplest conceptually, almost a straight transposition of the behaviour of contemporary traders onto that of their antecedents. Other authors have noted the existence of free-market behaviour in historical systems, particularly the Roman Empire, e.g., see Tom Brughmans and Jeroen Poblome (2016) and Xavier Rubio-Campillo et al. (2017), but what singles out the paper by Gojko Barjamovic et al. (2017) is the very heavy economic machinery (e.g., 'Constant Elasticity of Substitution', 'Weibull cost distributions') which, as we shall discuss later, they bring to make their case.
The second model is more subtle, concerned with the building of centres of influence from local communities, seeing synoikism in analogy with the creation of shopping centres. This is not to say that city-state formation is predominantly an economic activity. The advantages of close interaction are as much social and 'political' as economic. Unlike the case of traders, the parallels are structural in mimicking the way that dominant centres form but are not assumed to be a direct translation. Nonetheless, the method of solution is taken directly from the retail picture.
However, we shall see how both cases permit a 'maximum entropy' ('MaxEnt') representation (Jaynes 1957; 1979). This enables us to recast the models in such a way as to de-emphasize the projection of free-market narratives on the past, by looking for the 'least surprising' outcomes commensurate with our limited knowledge. Although the authors of these papers are well aware of this underlying duality their concern is different from ours. In fact, Alan Wilson (1970) provided an extensive discussion of the issues that, to a large extent, we are paraphrasing here. One of our aims is to bring these ideas to a wider audience of archaeologists than, from our experience, are currently familiar with them. As we have said, there is modelling that calls upon ideas from contemporary society in less explicit ways than these. However, the transparency of the models we consider here gives clear insights into the mechanisms of translating the present into the past that other applications may lack. We stress again that the analysis here is entirely within the framework of exchange networks.
Irrespective of the model, there is a major difference between contemporary economic and historic archaeological data. Economic data is typically 'big data'. Archaeological data, although voluminous and better characterised as 'lots of data', is statistically poor. Such data underdetermines the possible processes that lead to its presence. On the other hand, theoretical modelling of the type discussed here, which goes beyond the descriptive to have postdictive power, over-determines or overfits the data. This leads to a delicate balancing act to which we shall return. Of course, there is no reason why, for any particular archaeological data set, quantitative models of the type above should be satisfactory. Most simply, these work best when 'history is idling', when we can assume continuity of form, enabling us to go some way to fill in the gaps in the data. That is, we describe the periods of calm between the storms of war, famine and general disturbance, and we follow the authors above in assuming that Middle Bronze Age Assyria is such a time, although this is not to say that there is stasis. We shall revisit this question later.
We are more concerned here with the underlying nature of the modelling and shall only discuss the data analysis of these models briefly. The reader is referred to the original papers for details.

Assyrian trade as Ricardian economics
We take the examples of Assyrian trade and urbanisation in order, and treat the second with greater brevity. We shall be as minimal in our use of equations as we can.
The underlying question posed by Barjamovic et al. (2017) in their analysis of Assyrian trade is straightforward and of more general interest than just the Assyrians: how do we locate communities/sites which we know to have existed from the historical record, but whose geographical position is uncertain? Their suggestion is essentially to consider them as illuminated trading 'beacons' and to triangulate their positions from their (trading) 'intensity' to sites whose positions are known, on the assumption that they will appear 'dimmer' the further away they are from their trading partners. 'Exchange' here is limited to the exchange of 'goods'.
For the case in hand, Assyrian trade routes of the Early/Middle Bronze Age (19th century BCE) are identified from a cache of 12 000 deciphered cuneiform tablets. Details do not matter for the discussion here, but the basic data set comprises the origins and destinations of the (only) 391 directed journeys involving exchange between the sites listed in these tablets. There are 26 named sites which are either origins or destinations, the positions of 11 of which are unknown. Because of incomplete exchange data (e.g., a comparable number of untranslated tablets), the absolute numbers of trips are a poor proxy for site activity. Rather, for each site the fraction of transactions from each of the other sites is estimated. The aim is to use these ratios of flows to triangulate the missing sites, using the known sites for calibration purposes. The paper is clever and nuanced, but a true understanding of its methods requires a crash course in economics for most archaeologists (as well as ourselves), and we present only the briefest explanation. The reader is referred to the original paper and its key references (most importantly Jonathan Eaton and Samuel Kortum (2002) and James Anderson and Eric van Wincoop (2003)) for a more complete understanding.
A simple summary with as little algebra as possible follows. Labelling sites by $i, j, k, \ldots$ with values $1, 2, \ldots, 26$, the assumptions made by Barjamovic et al. (2017) are a 20th-century version of Ricardian economic theory in which we assume risk-taking independent traders buying and exchanging goods (of different types $\omega$) with the aim of making the best deals subject to:
i) No arbitrage (i.e. no guaranteed way to make a 'profit').
ii) Identical ad valorem 'iceberg melting' for all goods $\omega$ produced and acquired with different efficiencies. That is, within a cargo a similar fraction of goods of whichever type are needed to 'pay' (or are 'melted') for their cost of transportation and management.
iii) Ad valorem 'melting' is reciprocal, i.e. there is the same fractional penalty on goods going from site $i$ to $j$ as from $j$ to $i$.
iv) The outflow $O_i$ at site $i$ can be identified with the 'site activity' $S_i$. This qualifies the simple 'beacon' simile above while leaving it essentially correct.
v) Constant 'Elasticity of Substitution'; this concerns the ability of traders to substitute one good for another if necessary.
vi) The cost of producing one unit of $\omega$ in any city $i$ follows a Weibull distribution² depending on $T_i$, the efficiency of sourcing goods from $i$, and a parameter $\theta > 0$, a measure of the ease of distribution (inverse of costs).
All modelling makes simplifying assumptions about what constitutes exchange. Here, constant 'Elasticity' and 'Weibull-costing' allow us to aggregate different types of goods and efficiencies of production, whereas ad valorem 'melting' allows us to aggregate different modes of exchange with a common distance scale beyond which exchange gets difficult. The cumulative effect is that we can flatten 'exchange' from site $i$ to site $j$ into a single integer-valued directional label $T_{ij}$ which counts transactions (e.g., $T_{27}$ measures the number of exchanges from site 2 to site 7). It is sufficient for the moment to observe that these simplifying assumptions look anything but simple. Given the discomfort that many archaeologists have with algebra we have intentionally only partially decoded this language to highlight the difference in style and content between it and that of more conventional archaeological modelling.
The details of identifying the missing sites are not straightforward, and we refer the reader to the original paper (Barjamovic et al. 2017). In particular, they involve extremising utility functions representing consumer welfare/satisfaction subject to constrained finances, hence the jibe homo economicus for the actors in these pursuits. In defence, their argument is that the historic Assyrian merchants operated in a framework of trading contracts, judicial taxation, trading colonies and ports which make them as good a proxy for contemporary free-enterprise traders as we can get.
For all the sophistication of the modelling assumptions the outcome is relatively simple, although for the reader not versed in algebra it may seem somewhat obscure. The resulting exchanges $T_{ij}$ from $i$ to $j$ following from these assumptions take the form of a generalised 'gravity' model (Erlander, Stewart 1990)
$$T_{ij} = s_i \, (f^{av}_{ij})^{-\theta} \, s_j \qquad (2.1)$$
where $f^{av}_{ij}$ measures the ad valorem 'melting' as a function of site 'separation' between site $i$ and site $j$. Increasing with distance, $f^{av}_{ij}$ measures the increasing fraction of goods that are needed to pay for the exchange, and it is this which makes distant sites 'dim'. The $s_i$ are derived from the site activity variables $S_i$ by³
$$S_i = O_i = \sum_j T_{ij} = s_i \sum_j (f^{av}_{ij})^{-\theta} \, s_j = \sum_j T_{ji} \qquad (2.2)$$
($\sum_j$ denotes the sum over the sites $j = 1, 2, 3, \ldots$).
To get an understanding of what this means we will do a little further decoding. If we think of the $S_i$ ($i = 1, 2, \ldots, N = 26$) as a measure of active population/carrying capacity, then we think of the $s_i$ more as a measure of the strength or importance of the site, something more akin to site activity (e.g., Gross Domestic Product in a contemporary context). Suppose, for example, that the $S_i$ are equal. Then those sites which have more near neighbours with easier access for exchange accrue larger values of $s_i$ than those whose neighbours are more remote or less numerous and with which they exchange less by virtue of the cost/effort of doing so. We finally note that reciprocity in melting leads to detailed balance; inflows $\sum_j T_{ji}$ equal outflows $\sum_j T_{ij}$ at all sites (last equality in (2.2)), suggesting that traders use the proceeds of exchange to implement new exchanges in a simple way.
The data on ratios of transactions ($T_{ij}/O_i$) as $j$ varies is sufficient, in principle, to triangulate the missing sites. Assuming $f^{av}_{ij}$ grows as a power law with distance, Barjamovic et al. (2017) get sensible answers that accord with our historical understanding on incorporating further contextual information. We shall return to this later. For the moment we shall also ignore the fact that the data set is very small. The question that we wish to pose now is one of principle. If we knew no economic theory could we reach results (2.1) and (2.2) by other means, in particular means that do not require the explicit actions of agents?
The answer is largely yes, as we shall now show.

Assyrian trade as the 'most likely' outcome ('MaxEnt')
The alternative approach which enables us to evade direct comparison with free-market economics can be characterised as no more than making the 'best guess'. By that we mean what we would expect to have happened, all other things being equal. This is an old problem, the question of how to make best use of partial information⁴. In principle we know what to do. We list all the 'worlds' which are compatible with our knowledge or, equivalently, ignorance, and assume that each is equally likely, otherwise we are withholding information. The most typical of these is the way in which the system is most likely to have behaved, and the question then devolves to one of identifying this state and the extent to which it is more likely to be achieved than other competing states of the system. For the moment we concentrate on the first part of this question. This is such a familiar approach that we can forget that we are using it. A typical situation arises when playing cards; we know our hand and those cards which have been played and from that information make plausible guesses as to the most likely hands of our partners and opponents. Most simply, if we have one ace and only half the pack is dealt, our partner is unlikely to have three aces. The suggestion is that we might think of using our limited archaeological evidence in the same light.
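The card example can be made concrete with a short calculation. The sketch below assumes one particular reading that the text does not spell out: a 52-card pack, 13-card hands, our own hand holding exactly one ace, and our partner's 13 cards drawn at random from the 39 cards we cannot see. The function name and default values are ours, chosen for illustration only.

```python
from math import comb

def prob_partner_holds_aces(k, unseen=39, aces_unseen=3, hand=13):
    """Hypergeometric probability that a random `hand` drawn from `unseen`
    cards contains exactly k of the `aces_unseen` aces not in our own hand."""
    return (comb(aces_unseen, k) * comb(unseen - aces_unseen, hand - k)
            / comb(unseen, hand))

# Assumed setup: we hold 13 cards including one ace,
# so the 39 unseen cards hide the other three aces.
for k in range(4):
    print(k, round(prob_partner_holds_aces(k), 4))
```

Under these assumptions every unseen deal is treated as equally likely, and the chance that our partner holds all three remaining aces comes out at roughly 3%, which is the sense in which it is 'unlikely'.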
It looks an almost impossible task to list all 'worlds' compatible with our limited knowledge. Remarkably, making best use of limited information in this way can be quantified as the principle of maximum entropy ('MaxEnt') (Jaynes 1957; 1979). Although we colloquially think of (Shannon) entropy as associated with chaotic behaviour, it can be thought of as the number of questions with which we need to interrogate the system to have complete knowledge of it. Think of the popular game in which you have up to 20 questions with yes/no answers with which to identify what your opponent is thinking about. Entropy is thus a measure of our ignorance about the system. From this viewpoint the 'most likely'⁵ state of the system is the one with maximum entropy given our limited knowledge, since systems with less entropy assume more knowledge or have more implicit assumptions. Edwin Jaynes (1957; 1973) has also rephrased this as the Principle of 'Maximum Ignorance' or 'Epistemic Modesty'. The use of entropy in this way, to identify the 'least surprising' of possible pasts, has been termed a 'superconcept' by Alan Wilson (2010), from whom much of the following is derived.
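The reading of entropy as a count of yes/no questions can be illustrated numerically. The following is a minimal sketch, with a function name and example distributions of our own choosing: it computes Shannon entropy in bits and compares total ignorance (all possibilities equally likely) with a case in which we already know something.

```python
from math import log2

def shannon_entropy_bits(probabilities):
    """Shannon entropy in bits: the average number of well-chosen yes/no
    questions needed to identify the state of the system."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

uniform = [1 / 8] * 8                    # total ignorance over 8 possibilities
informed = [0.5, 0.25, 0.125, 0.125]     # hypothetical case with prior knowledge

print(shannon_entropy_bits(uniform))
print(shannon_entropy_bits(informed))
print(log2(2 ** 20))  # 20 questions separate 2**20 equally likely answers
```

The uniform case has the larger entropy, reflecting that more questions are needed when nothing is known; any additional knowledge lowers the count.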
Implementing 'MaxEnt' is still problematic, but we adopt the 'law of parsimony or Occam's Razor'⁶, the principle that the simplest solution to a problem tends to be the correct one. Although the principle sounds straightforward it is difficult to formulate in general. However, as happens here, when presented with competing models of a similar form, we should select the one with the fewest 'significant' unknown variables and parameters⁷. Explicitly, for the models of this paper the parameters of the 'MaxEnt' models are a subset of those of the economic models, and parsimony provides a simple marker for delineating the different approaches. More generally, when the models do not permit simple comparison, we fall back on Bayesian analysis (e.g., Rubio-Campillo et al. 2017). The assumption 'all other things being equal' seems a flat Bayesian prior, but the extent to which 'MaxEnt' is itself 'Bayesian' is disputed (e.g., see Cheeseman, Stutz 2004 and references therein), and we will not take the discussion further.
As we have said, the analysis of Barjamovic et al. (2017) was predicated on sources of trade behaving as 'beacons' which become 'dimmer' the further we move away from them, exemplary of Tobler's First Law of Geography, that "near things are more related than distant things" (Tobler 1970). As a first step we show how Tobler's law arises as the 'most likely outcome' from a simple implementation of 'MaxEnt'. We avoid explicit algebra where we can. The interested reader can find a more mathematical analysis in several chapters of Wilson (1970), in Sven Erlander and Neil F. Stewart (1990) and in our recent work (Rivers, Evans 2014;Evans, Rivers 2017).

Tobler's 'first law of geography' as 'MaxEnt'
As a first guess, our parsimonious approach, which we take as our null model for exchange, assumes minimal 'global'⁸ knowledge:
(a) Exchange takes place but it is (collectively) limited in scope.
(b) Exchange 'costs' or takes effort, but only so many resources are available globally (i.e. collectively).
(c) The 'cost' or the effort required for exchange increases with 'distance'. In practice, the cost of moving goods lies not just in the cost of their immediate transport, but also in the costs of sustaining the network. This will include supporting the agents and middlemen to enable the transactions to take place.
The outcome of maximizing the entropy of the system of exchange 'flows' subject to these global constraints is, indeed, 'Tobler's law' applied to exchange: that each site is connected to every other site and exchange decreases with 'distance' between sites. We do not have constant elasticity and ad valorem costing to fall back upon. Nonetheless, we assume parsimoniously that in the absence of further information, as a null assumption, exchange can be crudely characterised by a single number $T_{ij}$ whose value, if large, suggests strong exchange from $i$ to $j$ and, if small, weak exchange. For the case of Assyrian trade this will just be the integer-valued number of trips from $i$ to $j$.
Then, in appropriate units, 'MaxEnt' gives the most likely configuration of exchange flows as⁹
$$T_{ij} = s_i \, f_{ij} \, s_j \qquad (3.1)$$
where the input $s_i$ are a measure of site activity, related to the active population of $i$, and $f_{ij}$ is the so-called deterrence or impedance function for flows from $i$ to $j$, a reflection of the cost/effort of exchange from $i$ to $j$, which decreases with increasing separation. We have recovered 'Tobler's law' by replacing 'cost/effort increases with distance' with 'exchange decreases with distance', a very plausible equivalence¹⁰.
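To see (3.1) at work, the sketch below evaluates the flows for three hypothetical sites on a line. The coordinates, the unit site activities and the power-law deterrence $f_{ij} = d_{ij}^{-\zeta}$ with $\zeta = 2$ are all assumptions made for illustration; (3.1) itself only requires that $f_{ij}$ decreases with separation.

```python
from math import hypot

def deterrence(d, zeta=2.0):
    """Assumed power-law deterrence f = d**(-zeta), decreasing with distance."""
    return d ** (-zeta)

def simple_gravity(coords, s, zeta=2.0):
    """Flows T[i][j] = s_i * f_ij * s_j as in (3.1), with no self-flows."""
    n = len(coords)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = hypot(coords[i][0] - coords[j][0],
                          coords[i][1] - coords[j][1])
                T[i][j] = s[i] * deterrence(d, zeta) * s[j]
    return T

coords = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]  # hypothetical site positions
s = [1.0, 1.0, 1.0]                            # equal site activity
T = simple_gravity(coords, s)
print(T[0][1], T[0][2])  # flow to the near site versus the far site
```

With equal site activities the only thing distinguishing the two flows out of site 0 is separation, so the nearer site receives the larger flow, which is 'Tobler's law' in its simplest form.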
What we have here is the simplest of exchange models, the 'Simple Gravity Model'. As yet it is so simple that it does not incorporate networking. Removing a site just erases its links without any need for rearrangement of flows; the whole is just the sum of the parts. This is as we would expect from just implementing global constraints which make no reference to individual sites. With this in mind, as a second guess we introduce local constraints for transactions. Most simply, we first adopt the idea from 'Proximal Point Analysis' (PPA) that the total exchange flowing from any particular site is limited, with inflows unrestrained. 'Proximal Point Analysis' has had considerable success in archaeology (e.g., Broodbank 2000; Terrill 1986) in assuming most simply that any site only has the resources/energy to interact with a fixed number of nearest neighbours¹¹. We generalize this by extending our null model in which we replace condition (b) above by: (b) 'only so many resources are available locally', constraining the local outflows $O_i$ as in 'PPA'. Typically, in the absence of any further information, we take (in appropriate units) the total outflow equal to the site's local resources so $O_i = S_i$, as in Barjamovic et al. (2017).
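The 'Proximal Point Analysis' rule referred to above, in which each site interacts only with a fixed number of nearest neighbours, can be sketched in a few lines. The site coordinates and the choice of two neighbours per site are hypothetical, and the function name is ours.

```python
from math import hypot

def proximal_point_links(coords, k=3):
    """Directed PPA-style network: each site links to its k nearest neighbours."""
    links = {}
    for i, (xi, yi) in enumerate(coords):
        others = [(hypot(xi - xj, yi - yj), j)
                  for j, (xj, yj) in enumerate(coords) if j != i]
        # Sort by distance (ties broken by site index) and keep the k closest.
        links[i] = [j for _, j in sorted(others)[:k]]
    return links

coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]  # hypothetical sites
links = proximal_point_links(coords, k=2)
print(links)
```

Every site has the same number of outgoing links regardless of its surroundings, which is the sense in which outflows are limited while inflows are left unrestrained.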
In comparison to the simple gravity model, the local constraint on outflows gives us a 'Singly Constrained Gravity Model'. The addition of this local constraint is sufficient to network the model. For example, if we double the outflows and inflows we get the sensible scaling result that exchange flows double whereas, for the 'Simple Gravity Model' of (3.1), doubling the $s_i$ leads to a quadrupling in flows.
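The two scaling behaviours can be checked directly. The sketch below assumes the standard production-constrained form $T_{ij} = O_i \, s_j f_{ij} / \sum_k s_k f_{ik}$ for the singly constrained model, with destination weights $s_j$ set equal to the site inputs; the deterrence values and site inputs are hypothetical.

```python
def singly_constrained(O, s, f):
    """Singly constrained flows T[i][j] = O_i * s_j * f_ij / sum_k(s_k * f_ik),
    so that the outflows from each site i sum to O_i."""
    n = len(O)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        norm = sum(s[k] * f[i][k] for k in range(n) if k != i)
        for j in range(n):
            if j != i:
                T[i][j] = O[i] * s[j] * f[i][j] / norm
    return T

def simple_gravity(s, f):
    """Simple Gravity Model flows T[i][j] = s_i * f_ij * s_j, as in (3.1)."""
    n = len(s)
    return [[s[i] * f[i][j] * s[j] if i != j else 0.0 for j in range(n)]
            for i in range(n)]

# Hypothetical symmetric deterrence values between three sites.
f = [[0.0, 1.0, 0.25],
     [1.0, 0.0, 0.5],
     [0.25, 0.5, 0.0]]
S = [10.0, 20.0, 30.0]
double = [2 * x for x in S]

base = singly_constrained(S, S, f)
scaled = singly_constrained(double, double, f)
simple_base = simple_gravity(S, f)
simple_scaled = simple_gravity(double, f)
print(scaled[0][1] / base[0][1], simple_scaled[0][1] / simple_base[0][1])
```

The first ratio is the doubling of the constrained model and the second the quadrupling of the simple model, since the normalisation in the constrained model cancels one factor of the rescaling.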
In practice, a single constraint is not yet sufficient to describe either Assyrian trade or, later, Assyrian city-state formation. Each of these requires something further.

Assyrian trade as the 'Doubly Constrained Gravity Model'
For the case in hand of Assyrian trade we impose the further constraint (repeating Barjamovic et al. 2017) that, in the absence of more information, the deterrence function is reciprocal between sites; $f_{ij} = f_{ji}$ for all $i, j$.
Insofar as $f_{ij}$ is a function of the 'effective distance'¹² $d_{ij}$ between the sites $i$ and $j$, this becomes the statement that these distances $d_{ij} = d_{ji}$ are reciprocal, our parsimonious choice in the absence of further information.
This gives the 'Doubly Constrained Gravity Model', for which the 'MaxEnt' solution is
$$T_{ij} = s_i \, f_{ij} \, s_j \qquad (3.2)$$
$$O_i = S_i = \sum_j T_{ij} = s_i \sum_j f_{ij} \, s_j \qquad (3.3)$$
Because of the reciprocity in deterrence the final equation is also $O_i = S_i = I_i$ for all $i$, where $I_i = \sum_j T_{ji}$ is the inflow at site $i$. As in (2.2), the $s_k$ are now not independent, as in (3.1), but determined in terms of the input $S_i$ or $O_i$ through the constraints (3.3)! We stress that there is no need to invoke individual agents behaving in particular ways.
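Because the $s_i$ are only defined implicitly by the constraints (3.3), they have to be found numerically. The following is a minimal sketch of one way to do so, not necessarily the procedure used in any of the cited papers: it applies iterative proportional fitting (alternately rescaling rows and columns to match $S_i$) and then combines the two sets of scaling factors into a single $s_i$. The symmetric deterrence matrix and the inputs $S_i$ are hypothetical.

```python
from math import sqrt

def doubly_constrained(S, f, iterations=2000):
    """Solve s_i * sum_j(f_ij * s_j) = S_i for symmetric f by iterative
    proportional fitting, returning s and the flows T[i][j] = s_i f_ij s_j."""
    n = len(S)
    a = [1.0] * n  # row (outflow) scaling factors
    b = [1.0] * n  # column (inflow) scaling factors
    for _ in range(iterations):
        a = [S[i] / sum(f[i][j] * b[j] for j in range(n)) for i in range(n)]
        b = [S[j] / sum(a[i] * f[i][j] for i in range(n)) for j in range(n)]
    # For symmetric f with equal in- and outflows, a and b become proportional,
    # so their geometric mean gives a single factor per site.
    s = [sqrt(a[i] * b[i]) for i in range(n)]
    T = [[s[i] * f[i][j] * s[j] for j in range(n)] for i in range(n)]
    return s, T

# Hypothetical reciprocal deterrence (f_ij = f_ji, no self-exchange).
f = [[0.0, 1.0, 0.5, 0.2],
     [1.0, 0.0, 0.8, 0.3],
     [0.5, 0.8, 0.0, 0.6],
     [0.2, 0.3, 0.6, 0.0]]
S = [10.0, 20.0, 30.0, 25.0]

s, T = doubly_constrained(S, f)
print([round(sum(row), 6) for row in T])  # outflows, to compare with S
```

The resulting flow matrix is symmetric, so inflows equal outflows at every site, which is the detailed balance noted in the text. No statement about how individual traders behave enters the calculation.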
In summary, on comparing (3.2) and (3.3) to (2.1) and (2.2) we see what we have termed epistemic-ontic duality. By this we mean that, once we accept reciprocity between the exchange effort/cost between sites, the 'most likely' outcome for finding missing sites based on constrained local activity without having to invoke agents directly is equivalent to the technically much more sophisticated¹³ 'Ricardian' model of free 20th-century market traders with constant elasticity of substitution and efficiencies satisfying a 'Weibull' distribution, and so on. This is provided we identify (a) the outflows $O_i = S_i$ in the two cases and (b) $(f^{av}_{ij})^{-\theta}$ of Barjamovic et al. (2017) with $f_{ij}$ of the 'Doubly Constrained Gravity Model' (up to a fixed scale factor).
It could be argued that we are being disingenuous in downplaying the role of agents in 'MaxEnt'. That exchange occurs is a consequence of the presence of agents, and that it costs something is because of the efforts of agents. However, what we are saying from our position of ignorance is generic with no reference to the type of good exchanged, the means of exchange, the ease of production and access, let alone assumptions about seeking 'profit'. In fact, the simple requirement that deterrence or impedance to exchange increases with distance encodes no arbitrage. The model is to be thought of as a null model in which our coarse-graining of activity and 'cost' is taken as characterising some type of statistical averaging over the detailed activities of these agents, in this case in the framework of detailed balance.

Parsimony: primary and secondary problems
This comparison provides fertile ground for exploring the utility of parsimony, although some care is needed in its application. Trying to understand a trading network (or any historical system) poses several problems, often a primary problem which characterises the analysis (here, the positions of the missing sites) and a constellation of secondary confirmatory problems (e.g., the importance of these sites) which set the details. These latter may require more parameters which, given the uncertainties of network modelling, are likely to be less justifiable.
There is an analogy with our understanding of the Solar System which we find helpful. Essentially, the geocentric Ptolemaic/Aristotelian world-view positioned the Earth at the centre of the universe with the planets and the sun moving on circles embedded in spheres around it, whereas the heliocentric Copernican view had the sun at the centre with the Earth and the other planets moving around it (also in circles). The primary problem was whether the geocentric or heliocentric viewpoint was correct. In neither case were there 'laws of nature' to be invoked, in the way we understand the term today. At best there was an argument for circles on symmetric grounds as they permitted a Creator who could be the 'unmoved mover' as the planets circulated.
The heliocentric view prevailed less because of the data than because it provides a conceptually natural explanation of the 'wanderings' of the planets, which the geocentric view could not. Neither picture worked well quantitatively for the secondary problems of how the individual planets behaved. In the intellectual framework of the time, in which the paradigm was circular motion, both possibilities required large numbers of epicycles (circles on circles) to fit the data even approximately. We know why this happens; Newton's laws mean that the planets move in ellipses, to which circles are a poor approximation, although a heliocentric system of circles is still the better null model¹⁴. The analogy that we would draw with archaeological modelling is that there are no 'laws' of society so our major aim is to identify the system as 'heliocentric' correctly (i.e. 'solve' the primary problem). Since epicycles are misleading conceptually, only serving as a means to 'save the phenomena' (Duhem 1969), we would argue that we should not expect to have reliable solutions to the secondary issues in the absence of hard data. This is probably the best that we can hope for. Beyond that we are back in the Just-So territory alluded to earlier.
The primary problem posed by Barjamovic et al. (2017) was that of identifying the position of the 'missing' sites. What is surprising is that, as we have seen, we get identical equations for the triangulation of missing sites from 'MaxEnt' provided we make the identification between the iceberg melting and deterrence functions stated above, and nothing more. This outcome is independent of the $N = 26$ efficiencies $T_i$ of the paper and only dependent on $\theta$ as an exponent in the combination $(f^{av}_{ij})^{-\theta}$. Since we do not know either $f^{av}_{ij}$ or $f_{ij}$, $\theta$ itself is a redundant parameter. We stress that we imposed distance reciprocity in our 'Doubly Constrained Gravity Model' as the most parsimonious choice that we did not have enough information to refute. If subsequent data shows that reciprocity cannot be supported, from our entropy viewpoint we just fall back to the 'Singly Constrained Gravity Model' of (3.2) and (3.3) with no symmetry. We are unaware of any corresponding 'Ricardian' counterpart (although see Ward et al. 2013).
We are not for the moment concerned with the success of the enterprise in Barjamovic et al. (2017), which calls upon supplementary historical data, historical road systems, estimates of carrying capacity and the like. Suffice to say, it seems to work 'well'. We consider the results of the paper a major contribution to the field.

Uncertainty and robustness
We close this theoretical analysis with some brief thoughts on the uncertainties of the estimated outcomes that relate to the size of the network, which is small by most network standards. Our ability to predict missing sites is conditional on these uncertainties.
Both the agent-related economic model and the 'Doubly Constrained Gravity Model' are deterministic in their (identical) expected values of exchange events. However, the extent to which these estimates are reliable differs in principle between the models. Nominally, the 'MaxEnt' approach with its 'greatest likelihood' stance seems at odds with a probabilistic interpretation. However, building on the work of Wilson (1970), Yee Leung and Jianping Yan (1997) have shown that the uncertainty that we attribute to the most likely 'Doubly Constrained Gravity Model' ('MaxEnt') flows is just what would be expected if, as far as possible, individual exchanges occurred independently of each other (i.e. with no memory of past events). That is, we have 'Poisson statistics'. It is not clear from our 'MaxEnt' viewpoint if the data set is too small for us to be able to draw reliable conclusions, particularly given the large number of geographic links with no exchange. We have had a related experience in applying cost-benefit analysis to Greek city-state formation, for which the distance scales were too small to prevent fluctuations that were large enough to force us to abandon that particular model (Rivers, Evans 2014).
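A rough sense of why the size of the data set is a concern follows from the figures quoted earlier (26 sites, 391 directed journeys) together with the defining property of Poisson statistics, that the variance of a count equals its mean. The sketch below makes the deliberately crude assumption that journeys are spread evenly over all directed links, which real exchange certainly is not; it is meant only to indicate orders of magnitude.

```python
from math import exp, sqrt

n_sites = 26
n_journeys = 391
directed_links = n_sites * (n_sites - 1)     # possible origin-destination pairs
mean_per_link = n_journeys / directed_links  # assumes an even spread (crude)

def poisson_relative_sd(mean):
    """For a Poisson count the variance equals the mean,
    so the relative standard deviation is 1 / sqrt(mean)."""
    return 1 / sqrt(mean)

print(directed_links, round(mean_per_link, 2))
print(round(exp(-mean_per_link), 2))                 # chance a link shows no journeys
print(round(poisson_relative_sd(mean_per_link), 2))  # fluctuation relative to the mean
```

With well under one journey per directed link on average, fluctuations are comparable to the expected flows themselves and many links are expected to be empty, consistent with the large number of links with no recorded exchange.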
The situation for the economic modelling of Barjamovic et al. (2017) is different by fiat. Their analysis is closely related to that of Eaton et al. (2012) and of João Santos Silva and Silvana Tenreyro (2006) who adopt estimators which are 'Poisson-influenced' but not exactly 'Poisson'. In this way, additional calibration parameters enable them to get results that simple entropy prohibits. We are unable to determine to what extent this choice of variance is intrinsic to 'Ricardian' economic modelling, or is just adding further epicycles.

Data fitting
This leads us briefly to consider the problems with the data. For the case in hand we have N = 26 sites and only data for order $N^2$ links. With order N ('MaxEnt') calibration parameters (largely the $S_i$ and the coordinates of the missing sites) an acceptable match to the data is possible, without being tested by non-symmetric exchange. Better data would probably make non-symmetric exchange untenable. It is not clear how to generalise this 'Ricardian model' to accommodate this (although see Ward et al. 2013).
Oversimplifying, in the first instance Barjamovic et al. (2017) minimise the least square correlation between the predicted ratios of transactions and the ratios recorded from the tablets as they vary the positions of the missing sites, constraining both parameter values and the missing site positions. They then do more, identifying multi-stop itineraries which refer to missing sites to further constrain their positions. As for site importance, they call upon supplementary data, e.g., historic road-systems. To check the robustness of the predictions they omit known cities in a random way to check if their positions can be successfully reconstructed from the data. As we said earlier, this is a subtle analysis not really germane to our discussion, and we refer the reader to the original paper.
As anticipated, a priori the two formalisms do not give identical results for the secondary questions concerning the 'importance' of the individual sites, since it is difficult to compare the economic and 'MaxEnt' models as they are used. Barjamovic et al. (2017), largely with an economics background, do not approach networks in the same way as archaeologists with a social networks background. While archaeologists adopt the conventional attributes of sites in networks such as 'PageRank centrality', 'betweenness centrality', and so on (Newman 2010) to describe site significance, Barjamovic et al. (2017) invoke 'autarky', a measure of site self-sufficiency, the antithesis of networking, to give importance to theirs. Whereas the latter does make use of the hitherto redundant parameters, the 'MaxEnt' results display the emergent properties of the network with no further parameters, parsimonious to a fault. Since we are not comparing like with like the two methods cannot agree in detail. Whether that matters in practice, given the uncertainty of the historical record, is equally unclear. As yet there is no 'Tycho Brahe' to improve the data.
However, from another viewpoint, this chimes with our earlier observation that the contemporary modelling may itself be unreliable. Indeed, it has been argued that constant elasticity of substitution and 'Weibull distributions' are introduced for their analytic solvability, rather than their representation of real systems (e.g., Spilimbergo et al. 2003). Further, ad valorem 'iceberg' melting does not even work when applied to the transportation of ice (Bosker, Buringh 2018).

Assyrian settlement structure and city-state formation
As we have said, Barjamovic et al. (2017) argued that Assyrians are a good proxy for contemporary free enterprise traders and that 20th-century models should work in this case, but the argument for an epistemic approach is more general. This duality is present in our second example of city-state formation in Bronze Age Assyria. It is sufficient to see how the modelling fits into our general theme, and we shall present it in less detail. This example may seem surprising since, although historic and pre-historic city-state formation have some contemporary and near-contemporary parallels, they seem to have little in common with the models of trade exchange familiar to economists along the lines of our earlier discussion. That a parallel can be drawn with 20th-century economics is due to Wilson (1971; 1976), who repurposed the free-market 'shopping' or 'retail' model of David L. Huff (1964) and Tiruvarur R. Lakshmanan and Walter Hansen (1965) to this end. Wilson (1971; 1976) and Britton Harris and Alan Wilson (1978) argued that synoikism, the key ingredient of state-formation, has its counterpart in the patterns of department stores incorporated in shopping centres.

The 'Retail Model'
The basic assumptions of the retail model, in the terminology of retail outlets, are that:
i) In equilibrium, retailing 'activity' (e.g., cash flow) is proportional to 'capacity' (e.g., floor space).
ii) The aim is to maximise 'consumer surplus' subject to the constraint of fixed outflows. This enables us to convert 'capacity' into site 'attractiveness', measured through the inflows.
iii) The inflows of the dominant sites partition space into zones of influence.
There are variations in the way that the model can be formulated but, most simply, the conversion of capacity into attractiveness is effected by the introduction of a further set of parameters $Z_i$ (one for each site) which reflect site activity, converted into site size. These are in addition to the flows $T_{ij}$ which determine the inflows. The $Z_i$ are determined by maximising the 'Marshall-Hotelling' (Hotelling 1929) consumer surplus. Again homo economicus looms large.
The final step is to relate the $Z_i$ to the final attractiveness, identified through the inflow $I_i$, understood as the $Z_i$ equilibrium values. This evolution of the 'attractiveness' of a site to its equilibrium value is problematic. Most simply a linear response is adopted (Harris, Wilson 1978). More dramatically, it can be understood as treating the agent 'consumers' as 'prey' to the outlets, as determined by a non-linear 'Lotka-Volterra' approach (Wilson 2008). However, insofar as the required outputs are the equilibrium values, the details of the approach to equilibrium are not relevant as long as they avoid the 'period-doubling cascades' that are a precursor to chaotic behaviour (Osawa et al. 2017).
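The linear-response relaxation to equilibrium can be sketched as follows. This is a minimal illustration in the spirit of the standard Harris and Wilson formulation, not the implementation of any of the papers cited: the exponential deterrence function, the parameters `alpha`, `beta`, `kappa` and `eps`, the function name and the toy coordinates are all assumptions. Here `z` plays the role of the site sizes and `inflow` the role of the inflows.

```python
import math

def retail_equilibrium(outflows, dist, alpha, beta, kappa=1.0, eps=0.01, steps=5000):
    """Relax site sizes z towards inflow / kappa by a linear response."""
    n = len(outflows)
    # Start from equal sizes whose total matches total outflow / kappa.
    z = [sum(outflows) / (kappa * n)] * n
    for _ in range(steps):
        inflow = [0.0] * n
        for i in range(n):
            # Each origin splits its fixed outflow according to
            # attractiveness z**alpha damped by an ASSUMED exponential cost.
            weights = [z[j] ** alpha * math.exp(-beta * dist[i][j]) for j in range(n)]
            norm = sum(weights)
            for j in range(n):
                inflow[j] += outflows[i] * weights[j] / norm
        # Linear response: sizes move a small step towards inflow / kappa.
        z = [z[j] + eps * (inflow[j] - kappa * z[j]) for j in range(n)]
    return z

# Hypothetical toy data: four sites on a line with unit outflows.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
dist = [[math.dist(p, q) for q in coords] for p in coords]
sizes = retail_equilibrium([1.0] * 4, dist, alpha=1.5, beta=1.0)
```

Only the final `sizes` would be compared with data; the path by which they are reached depends on `eps` and on the assumed response law, which is the point made above about the approach to equilibrium.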
It is clear that the 'retail' model is of a particular time and place, mainly late 20th-century Western nations, for which it captures the 'death of the High Street' and the creation of malls. The arrival of the internet and online shopping has made the model largely redundant. We might expect the archaeological applications to be equally constrained in time and space, but the model was subsequently translated by Tracey E. Rihll and Alan Wilson (1987; 1991) to describe the emergence of the polis in the 19th century BCE mainland Greek Iron Age city states as a result of:
• Synoikism: Surrendering of local sovereignty to a wider community.
• Urbanisation: Emergence of dominant settlements.
We would not be so crass as to pair Argos/Argos™¹⁵, for example, but the way in which dominant sites that partition territory arise makes the parallels between ancient and modern site-dominance plausible. Its success in this case, despite some caveats (Evans, Rivers 2017), has led to several successful subsequent applications: e.g., Bronze Age Crete (Paliou et al. 2016; Bevan et al. 2016), La Tène West Europe (Filet 2017) and Middle Bronze Age Anatolia (Davies et al. 2014; Palmisano, Altaweel 2015), as discussed below. Once urbanisation has been implemented, the model is exhausted.

Assyrian settlement structure
The applications of the retail model that we consider here are those of settlement formation in the Middle Bronze Age and Iron Age Khabur triangle (Davies et al. 2014), complemented by the work of Alessio Palmisano and Mark Altaweel (2017) who extend this approach to settlements in Middle Bronze Age Central Anatolia. That is, in part we are looking at Assyrian society at approximately the same time and place as Barjamovic et al. (2017) but at a different level of organisation, that of settlements rather than individual traders.
What interests us is that the equilibrium 'consumer surplus' extremisation of the retail model permits a re-interpretation as the maximisation of a constrained entropy (Wilson 1970), also invoked by the authors above. To implement 'MaxEnt' we return to the 'Singly Constrained Gravity Model' of the previous section with its local constraints on outflows (also assumed in the retail model). As with the case of Assyrian trade, we need to impose an additional constraint on our generalised inflows. The creation of zones of influence around dominant city-states is an asymmetric process. Rather than the local constraint of detailed balance between inflows and outflows imposed on traders, we adopt the global constraint that the entropy of the inflows from the burgeoning city-states is fixed (Rihll, Wilson 1991). Suffice to say that if we were to implement this final constraint alone we would have (up to a multiplicative constant)

$$T_{ij} = I_i^{g} f_{ij} \tag{5.1}$$

where, in the absence of any further information, we have set all outflows equal (as in 'Proximal Point Analysis'). We see that, for 'attractiveness'¹⁶ g > 1, sites with larger inflows become dominant at the expense of the rest, commensurate with synoikism. Imposing the other constraints makes (5.1) much more complicated¹⁷. Nonetheless, the primary question of determining the dominant states has a solution essentially replicating the equilibrium behaviour of the 'retail' approach, showing a few dominant sites which partition space into zones of influence. We note that we can still preserve the symmetry $f_{ij} = f_{ji}$ and $d_{ij} = d_{ji}$. The asymmetry in the outcomes arises from the asymmetry in the way we treat inflows and outflows.
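The effect of g in (5.1) can be seen in a small numerical sketch. It reads g as an exponent on the inflow and isolates that factor alone, ignoring the deterrence function and every other constraint, so it is an illustration rather than a solution of the model. The function name and the toy inflows are hypothetical.

```python
def inflow_weighted_shares(inflows, g):
    """Share of each site when weights are proportional to inflow ** g."""
    weights = [x ** g for x in inflows]
    total = sum(weights)
    return [w / total for w in weights]

inflows = [1.0, 2.0, 4.0]                         # hypothetical relative inflows
shares_g1 = inflow_weighted_shares(inflows, 1.0)  # 1/7, 2/7, 4/7
shares_g2 = inflow_weighted_shares(inflows, 2.0)  # 1/21, 4/21, 16/21
```

Raising g from 1 to 2 lifts the share of the largest site from 4/7 to 16/21 while the smallest falls from 1/7 to 1/21, which is the sense in which larger sites gain at the expense of the rest.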
Since city-state formation permits a 'MaxEnt' description with no direct reference to agents, the epistemic-ontic duality is seen again, although parsimony is implemented differently here in two ways. Most simply, the first accords with our earlier simple definition, in that the difference between 'retail' and 'MaxEnt' lies in doubling the number of variables in the 'retail' approach. These collapse to a single set in equilibrium, those of 'MaxEnt'. As a result the equilibrium site ranking is the same in both approaches.
Secondly, the more profligate approach permitted by the retail model lies in the way that it provides narratives for the evolution of the system to its equilibrium state (Harris, Wilson 1978). As for the implementation of the model, Toby Davies et al. (2014) adopt non-linear 'Boltzmann-Lotka-Volterra' predator/prey dynamics whereas Alessio Palmisano and Mark Altaweel adopt linear 'Boltzmann-Lotka-Volterra' dynamics. That is, with 'Lotka-Volterra' 'time' understood as historical time, in principle the retail model allows us to address the diachronic 'secondary' issues as to how site differentiation might arise, unavailable to 'MaxEnt'. How seriously we should take these narratives is a separate issue, insofar as they are not used in data comparison. There is a potential problem in that if the retail model solution is an 'attractor' to the deterministic 'Lotka-Volterra equations' our narrative looks to be one of effective historical determinism. This can be avoided by the explicit inclusion of multiplicative noise in the 'Lotka-Volterra' equations (Ellam et al. 2017), but simple 'MaxEnt' evades this problem by only providing equilibrium site rankings.

¹⁵ Where the first name is from Archaic Greece, the second a U.K. retail chain.
¹⁶ g is the Lagrange multiplier associated with the fixing of inflow entropy.
¹⁷ In fact, in the solution of the dynamical flows a further constraint between inflows and outflows is imposed which goes beyond the original shopping model and its original archaeological applications. This does not change the nature of our argument.

Data
Insofar as it is the equilibrium values which are used for data analysis, the clear separation into primary and secondary questions that we found so useful for traders is not relevant because of the identity of the outputs.
Unlike the case for Assyrian traders, the input data here is largely site populations and positions, and the model outputs are not individual flows but site inflows identified with site size. There are several ways to correlate these outputs to the data that look for effects that go beyond our expectations from geography alone. To abbreviate a complex analysis in each case, the primary comparison of Palmisano and Altaweel (2015) is with conventional network analysis. From site inflows they construct a 'hierarchical Nystuen-Dacey' network (Nystuen, Dacey 1961) that encodes synoikism. The resulting zonal network is then analysed with conventional centrality measures. For Davies et al. (2014) stress is put on site size distributions rather than on the individual sites themselves. As in Barjamovic et al. (2017) in each case robustness is demonstrated through partial dataset sampling. Both papers do an excellent job of making their cases, and we refer the reader to them for details.

Discussion
There is no doubt that 20th-century economic models have proved very useful in motivating archaeological models with the same structure, adopted almost unadorned by their historical context. In this paper we have argued wherever possible for an 'epistemically modest MaxEnt' approach (Jaynes 1973), which enables us to avoid an explicit narrative of agents constrained by detailed behavioural rules.
Although the examples discussed here are very different, both economic and 'MaxEnt' approaches rely on maximization in different ways:
• Economic models assume the maximization of benefit to traders or sites, perhaps by the extremization of utility functions, the definition of homo economicus adopting rational economic behaviour.
• 'MaxEnt' models make use of the more general extremization of entropy or, equivalently, make the best use of limited information, usefully rephrased as the 'Principle of Maximum Ignorance' (Jaynes 1957; 1973).
For the examples here we have seen that, in the main, these different ways of looking at the same primary problems ('missing sites' and 'dominant sites', respectively) give the same key results.
We have made little reference so far to 'Bayesian' analysis, but it could be said that, insofar as we are swapping homo economicus for a flat 'Bayesian prior', we do not need to know economics to answer the primary questions for the models given here. However, from a viewpoint of parsimony this epistemic-ontic duality is not evenly balanced. Contrast the list of assumptions made in 'Ricardian' and 'Retail' modelling with those of the 'Constrained Gravity Models'. That these models can be put in correspondence with 'MaxEnt' shows the redundancy in the economic modelling assumptions when addressing the main questions for large enough systems. This redundancy will not apply to secondary questions to which the models will give different answers.
As we have seen, the situation is different for small systems, but our goal in this paper has been more about generics.
Whether the data and the models are trustworthy enough to give useful results to these secondary questions is another matter in the light of what we said earlier; assuming large enough data sets the number of parameters is small, even for economic modelling, such that we can only expect very broad agreement with data from whichever viewpoint. The example of Greek city-state formation (Rihll, Wilson 1987) is a case in point. Whereas there is very good reason for Athens and Corinth to be dominant states, within the modelling the significance of Thebes is more equivocal (Rivers, Evans 2014;Evans, Rivers 2017). That Thebes was as important as it was, where it was, is due to factors that our simple modelling cannot incorporate, such as the rise of one socio-political 'house' over another. However, a significant site somewhere in that region was to have been expected.
It is because of these qualifications about secondary issues that we reject the reverse engineering that, since our 'least surprising' entropy results are, in the first instance, commensurate with free-market economic models, a free-market society is the 'least surprising', if not 'obvious', outcome for describing exchange in this period. We would argue that the devil lies in the secondary details, as we see in Eaton and Kortum (2002) and Anderson and van Wincoop (2003). These are not the 'least surprising' outcomes since they rely on subsidiary information. This is why economists preserve their complicated models rather than use 'MaxEnt'. Without commenting on the reliability of their models, their data is generally so good that the broad brush approach of 'MaxEnt' is inadequate.
For large enough systems this difference between the economic and entropy-maximizing approaches is seen most clearly in how they address temporal change, to which we have referred. Whereas economic models are dynamic, 'MaxEnt' looks for equilibrium behaviour. Change can be accommodated in 'MaxEnt', e.g. an overall increased difficulty in travel due to banditry/piracy affecting the exchange patterns in the South Aegean (Knappett et al. 2011) and in the more dramatic case of the eruption of Thera (Rivers 2018). A similar parameter shift occurs in Davies et al. (2014) to describe changes in settlement patterns between Middle Bronze Age and Iron Age sites. However, these are exogenous effects unlike the endogenous behaviour encoded in the 'Lotka-Volterra' equations of the retail model (Wilson 2008; Ellam et al. 2017). We might argue that, from an agent-related economic viewpoint, 'history' is an attempt to achieve 'good' functionality from a non-optimal beginning whereas, from a 'MaxEnt' viewpoint, 'history' is an attempt to maintain 'good' functionality at all times as circumstances change.
These two models are not the only 20th-century economic models that have been translated to the historic past. In particular, there are 'Intervening Opportunity Models' which assume that transactions between two sites i and j are proportional to the number of 'opportunities' at destination site j and fall off inversely with the number of 'intervening opportunities'. Introduced by Samuel Stouffer (1940) and developed by Morton Schneider (1959), this approach was used to model commuting patterns, but it has a natural extension in archaeology where, in an extreme form, it occurs as 'Proximal Point Analysis', as mentioned earlier. There is a 'MaxEnt' realization of the general model, albeit in a slightly tortured way (Wilson 1970). Beyond 'Proximal Point Analysis' (e.g., see Broodbank 2000; Terrill 1986) it has been applied to Mediterranean maritime exchange by coastal tramping in the Late Bronze Age (Rivers et al. 2016).
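In its simplest form 'Proximal Point Analysis' links each site to a fixed number of its nearest neighbours, whatever lies beyond them. A minimal sketch follows, in which the function name, the number of neighbours `k` and the toy coordinates are assumptions chosen for the example rather than values taken from the studies cited.

```python
import math

def proximal_point_links(coords, k=3):
    """Map each site index to the indices of its k nearest other sites."""
    links = {}
    for i, p in enumerate(coords):
        others = [j for j in range(len(coords)) if j != i]
        # Order the other sites by straight-line distance from site i.
        others.sort(key=lambda j: math.dist(p, coords[j]))
        links[i] = others[:k]
    return links

# Hypothetical toy data: four sites along a coastline.
sites = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
links = proximal_point_links(sites, k=1)
# links == {0: [1], 1: [0], 2: [1], 3: [2]}
```

The links are directed and need not be reciprocated: site 2 points to site 1, but site 1 points to site 0. Straight-line distance is used only for simplicity; a travel-cost measure could be substituted.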
Of course, there are models based on 20th-century economics (e.g., cost-benefit analysis, in which we look for the 'best' outcome) that seem to have no direct epistemic counterpart in terms of making the best use of information, but that is another story¹⁸. Our aim here has been the more limited one of trying to demystify unnecessarily complicated economic machinery that has been used to explain historic and prehistoric exchange.
One important class of models that we have not addressed is that of agent-based models, which can build on free-market behaviour (e.g., Brughmans, Poblome 2016). Nominally, they fall outside our analysis in that our emphasis has been on avoiding bottom-up narrative in favour of generic likelihood. In fact, agent-based modelling could not replicate an analysis as detailed as that of Barjamovic et al. (2017), although it can be built upon entropy maximisation (Altaweel 2015), perhaps bringing the best of both worlds. We shall not pursue this further.
So, in summary answer to the question in the title of this paper, archaeologists often try to impose a (free-market) present on to the past. However, for the models we have discussed here we find that in the first instance the significant results are nothing more than the 'least surprising' results that follow from maximising our ignorance ('MaxEnt') of free-market behaviour under simple assumptions, the most parsimonious approach. The situation is different for secondary questions when free-market analogues are much less parsimonious (cf. 'MaxEnt') and require a large amount of additional detail. Given our poor understanding of the model parameters and the ambiguity of the archaeological data to which these models are applied, this can be spurious or unable to be substantiated. Nonetheless, it has to be said that the economic models, however suspect their detailed assumptions, do provide a rationale as to how different types of goods produced and transported differently can be aggregated. As such they motivate the simple averaging that happens with 'MaxEnt' null modelling, even if the results are not to be taken too seriously. However, that is the nature of null models, and it is not clear in general that we do any worse by continuing with 'MaxEnt' when possible. More case-by-case analysis is required but, arguably, being simple can be enough: "pluralitas non est ponenda sine necessitate" (William of Ockham).

¹⁸ The difference may not be as great as it seems. For large systems there is a duality between the 'most likely' outcome and the 'best'.