Re: CF: object structure layout.

To: crossfire@ifi.uio.no
Subject: Re: CF: object structure layout.
From: Mark Wedel <mwedel@scruznet.com>
Date: Wed, 16 Jun 1999 04:30:09 +0000
References: <Pine.LNX.4.04.9906131327480.4436-100000@staberinde.netpolicy.com> <37641781.319D7EC9@scruznet.com> <1rso7uvonp.fsf@haey.ifi.uio.no> <3765D908.3940B3D9@scruznet.com> <1r4sk9vcao.fsf@haey.ifi.uio.no>
Sender: owner-crossfire@ifi.uio.no

Kjetil Torgrim Homme wrote:

> >   /* To see better details about this, look at doc/objects */
> >   typedef struct obj {
> >       struct obj *next;       /* Pointer to the next object in the free/used */
> >                               /* list */
> >       struct obj *prev;       /* Pointer to the previous object in the */
> >                               /* free/used list*/
> >       struct obj *active_next;    /* Next & previous object in the 'active' */
> >       struct obj *active_prev;    /* List.  This is used in process_events */
> >                                   /* so that the entire object list does not */
> >                                   /* need to be gone through. */
> 
> What are the advantages of keeping used and free objects in the same list?
> 
> I think it would be better to separate this into two lists, this
> reduces the object size and results in less pointer manipulation when
> allocating (I think).

 I assume you mean active/non active there, and not free/used (free and use are
already on seperate lists.)

 For active/non active, you are probably right - seperate lists makes sense. 
The above was a carry over from when I added the active list - instead of
updating all the list code, I just added the additional fields.  Since most
everything is going to get redone anyways, there is no reason for that.

> 
> Another point which I've been thinking a bit about lately, is the
> cache friendliness of code like this.  If we have 4000 objects @ 512
> bytes each, we're blasting through 2 MB of data when traversing the
> list.  This _really_ thrashes the cache and memory (and 4000 objects
> is very conservative).  A memory stall is extremely expensive these
> days (it's measured in hundreds of nanoseconds), so it would be good
> to take this into account when redoing these structures.

 Generally, we only go through the active lists, which tend to be smaller.  And
on further thinking, the list of 'active' objects could perhaps be even further
reduced - animated objects are considered active (to update face information) -
that can all be handled on the client, so any object which is scenary but is
animated would no longer need to be on the active list.

> 
> The goal is to reduce the amount of data accessed when going through
> the linked list.  This means the the object is split in two, one small
> chunck with the next pointer and the most often accessed data, and a
> pointer to the full data.  E.g.:
> 
> struct obj {
>         struct obj *next;
>         float speed;
>         float speed_left;
>         uint32 tag;
>         struct objdata *data;
> }
> 
> This reduces the object size to 20 bytes, and the above example of
> 4000 objects fits into 80 kB.

 I guess the performance may depend on the compiler/cpu and how it deals with
cache loads?  For example, if the object is 512 bytes lets say, but you are only
accessing stuff in the first 20, does it still load all 512, or does it only
load the 20 bytes?  My knowledge in that area isn't that good, but if it is only
loading the 20 bytes or whatever, then the key is to organize the structure so
the most needed fields are first.

 You probably need to add the map pointer to the above.  The main object
traversal is each tick when it goes and sees what objects need to do something. 
At times, that traversal needs the map object.

 Another issue is that for the objects that do need to do something, the rest of
the object structure is going have to get loaded anyways.  So if you wanted to
get more clever, you could have various run queus - ie, the 1 tick away run
queue, the 2 tick away, 3 tick,.. to 5+ tick away.

 And these cycle around.  This doesn't work well for things that change speed,
but typically that is only the player which is likely to be handled special in
any case.  If when processing a queue, you run accross something that isn't
ready to run yet, it gets placed back into the best estimate again for next run
time.

 But frankly, if we do follow up with some of the proposals on future game play
and reduce the number of monsters, and do some other clean up, the only thing on
the active queue will be monsters and spell effects (firebolt, lightning bolt,
etc) - all animation will be done by the client.  If that is done, I don't see
the active list being all that large even with quite a few players.

> <aside>
>         I think the speed and speed_left should be redone to use
>         integers and timestamps.  I.e.
>             t = pticks << 8;  # 32 bit ints => 24 days before wraparound.
>             for each obj {
>                 if (obj.wakeup >= t) {
>                     obj.wakeup += obj.speed;
>                     do stuff (obj);
>                 }
>             }
>         obj.speed is calculated when loading the map as:
>         obj.speed = (1/speed) * max_time * 256;
> 
>         The advantage of this is that there is only one compare for
>         each object, instead of a subtract, compare and a write back
>         into memory.

 That sort of goes a little with the above.  The disadvantage mentioned above
(varying object speeds) still apply.  But actually calculating when the object
is due to do something could be useful in other regards - monsters or the like
may be able to make better decisions if they can see their friend isn't going to
do anything for 10 ticks.

> I think we should consider using a garbage collector for this.  I'd
> like to use <URL:http://reality.sgi.com/boehm_mti/gc.html>, it sounds
> very cool, and perfect for our purposes (its assumptions will hold
> true, so it will seldom guess wrong, I think).  I haven't tried using
> it before, though.  In this case, perhaps it is best to keep a common
> list for free and used objects, but let the "usedness" be denoted by a
> NULL objdata pointer.

 Crossfire of course already uses some form of reuse be keeping a free object
list.  The current limitation is that it never recycles free objects, and some
other data in the object does get freed/allocated often (some strings - although
most use the shared string library)

 In some sense, I would like to keep things simpler for now - at current time it
doesn't seem to be stuff like the object traversal being a limiting time
factor/causing slowness on the server, but other tasks like map loading/saving.

 I know in one of my mail messages a while back following up on David's idea of
a more unified scale would be to only process objects in the vincinity of the
player (so if you had a 50x50 map for example, only objects with 10 spaces of
the player gets processed).

 Even if that doesn't happen, it could make sense to store active objects on a
per map linked list, so you can completely ignore objects you don't need to
process.  You could do something like:

1) Keep track of last tick a player was on any particular map.
2) Have individual maps be setable for their swapout time and how long to still
move monsters for after last player leaves.
3) Only process objects for maps players are on or that fall into parameters set
in #2 above.

 There are advantages here:
1) Some commonly used maps could have very long swap out times without adversely
affecting server performance (only memory for hte objects in ram).
2) If the only referances to objects on a map are within the map object
itself (and not global lists), it should be easier to require threaded load/save
routines since there would be less synchronization needed with other lists.
3) Following up with #2 above, it would be quite conceivable for each map to
have its own thread to handle the creatures on the map.

 I think with the move towards SMP systems and the fact that various actions can
cause the server to block, thinking about possible ideas to utilize threading
(and thus, multiple processors indirectly) would probably be a good thing.
-
[you can put yourself on the announcement list only or unsubscribe altogether
by sending an email stating your wishes to crossfire-request@ifi.uio.no]

Follow-Ups:
- Re: CF: object structure layout.
  - From: Kjetil Torgrim Homme <kjetilho@ifi.uio.no>

References:
- Re: CF: object structure layout.
  - From: Kjetil Torgrim Homme <kjetilho@ifi.uio.no>
- Re: CF: object structure layout.
  - From: Mark Wedel <mwedel@scruznet.com>
- Re: CF: object structure layout.
  - From: Kjetil Torgrim Homme <kjetilho@ifi.uio.no>

Prev by Date: Re: CF: object structure layout.
Next by Date: Re: CF: object structure layout.
Prev by thread: Re: CF: object structure layout.
Next by thread: Re: CF: object structure layout.
Index(es):
- Date
- Thread