Extracted info, and started a new thread as the subject changed.
(...)
This is partly my fault, resulting from switching to seasonal memory for
name resolution in r45511. We call se_free_all() a lot, which means
calling host_name_lookup_init() a lot. It might be better to use a
different allocator for resolved addresses or to delay reading any hosts
files somehow. Either way we need to make sure resolved addresses don't
leak from one capture to the next.
See also bug #8349 (if the user exports a filtered subset of the
capture, only resolved names relevant to that subset should be
exported).
I think, in general, the resolved addresses that get written out on
save should be based on which packets get written out, not on which
names we have cached (looks like we'll need another member for
frame_data, oh joy).
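
As a rough illustration of that idea, here is a minimal sketch in C. It assumes a flat array of cached names and a per-frame record that carries the source and destination addresses. The types resolved_entry and frame_info and the function collect_names_for_export() are hypothetical names made up for this example; they are not Wireshark's frame_data or its resolver structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry; NOT Wireshark's real structure. */
typedef struct {
    uint32_t    addr;  /* IPv4 address, host byte order */
    const char *name;  /* resolved host name */
    int         used;  /* set when a written frame references addr */
} resolved_entry;

/* Hypothetical per-frame record; stands in for an extra frame_data member. */
typedef struct {
    uint32_t src;
    uint32_t dst;
    int      selected; /* nonzero if the frame is part of the export */
} frame_info;

/* Mark the cache entry for addr, if there is one, as used. */
static void mark_addr(resolved_entry *cache, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (cache[i].addr == addr) {
            cache[i].used = 1;
            return;
        }
    }
}

/*
 * Fill out[] with pointers to the cache entries referenced by the selected
 * frames and return how many were stored. out[] must have room for n_cache
 * pointers. Entries keep their cache order.
 */
size_t collect_names_for_export(resolved_entry *cache, size_t n_cache,
                                const frame_info *frames, size_t n_frames,
                                const resolved_entry **out)
{
    size_t n_out = 0;

    assert(cache != NULL && frames != NULL && out != NULL);

    /* Reset marks so that an earlier export does not leak into this one. */
    for (size_t i = 0; i < n_cache; i++)
        cache[i].used = 0;

    for (size_t i = 0; i < n_frames; i++) {
        if (!frames[i].selected)
            continue;
        mark_addr(cache, n_cache, frames[i].src);
        mark_addr(cache, n_cache, frames[i].dst);
    }

    for (size_t i = 0; i < n_cache; i++) {
        if (cache[i].used)
            out[n_out++] = &cache[i];
    }
    return n_out;
}
```

The linear scan in mark_addr() is only there to keep the sketch short; a real implementation would presumably use whatever hash table already backs the name cache.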
Once that's done properly then we can look at cleaning up the caching
logic so that we don't have to keep rereading the hosts file. I
suspect the simplest and best method is to never flush the cache - I
can't imagine it getting unreasonably large, and it means we never
have to look up the same address twice.
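
A minimal sketch of such a never-flushed cache, under stated assumptions: a fixed-size open-addressing table keyed by IPv4 address, with slow_resolve() standing in for whatever actually does the lookup (hosts file, DNS). All names here (cached_resolve, slow_resolve, CACHE_SLOTS) are hypothetical and are not taken from the Wireshark source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_SLOTS 1024u  /* must be a power of two; size is illustrative */

typedef struct {
    uint32_t addr;
    char     name[64];
    int      in_use;
} cache_slot;

static cache_slot cache[CACHE_SLOTS];  /* zero-initialized, never flushed */
static unsigned   slow_lookups;        /* counts calls to slow_resolve() */

/* Stand-in for the real (expensive) resolver; it just fabricates a name. */
static void slow_resolve(uint32_t addr, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    slow_lookups++;
    snprintf(buf, len, "host-%u", (unsigned)addr);
}

/*
 * Return the cached name for addr, resolving it on first use only.
 * Returns NULL if the table is full.
 */
const char *cached_resolve(uint32_t addr)
{
    /* Multiplicative hash, masked down to the table size. */
    uint32_t start = (uint32_t)(addr * 2654435761u) & (CACHE_SLOTS - 1u);

    for (uint32_t probe = 0; probe < CACHE_SLOTS; probe++) {
        cache_slot *s = &cache[(start + probe) & (CACHE_SLOTS - 1u)];

        if (s->in_use && s->addr == addr)
            return s->name;            /* hit: no second lookup */

        if (!s->in_use) {              /* miss: resolve once and keep it */
            s->addr = addr;
            s->in_use = 1;
            slow_resolve(addr, s->name, sizeof s->name);
            return s->name;
        }
    }
    return NULL;
}
```

Because entries are never removed, a pointer returned by cached_resolve() stays valid for the lifetime of the process. That is the property that session-scoped memory cannot give across se_free_all() calls.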
I think we should consider how we want this to work and what the performance hit of implementing it would be.
I can also see the need for making the writing of the address resolution block optional.
- What is the rationale for limiting the address resolution to IP addresses in a subset of a larger file? It's nicer but is it worth the effort/performance hit?
There is also a use case for a fat resolution database, as the info can be extracted and put in a hosts file in a profile for later use.
- Flushing the cache between loading files is needed, I think, as the files may be from different private networks with overlapping IP addresses.
- There are security/privacy issues, but if you are concerned about those, perhaps address resolution should be turned off.
... ?
In our labs, I think hosts files are more commonly used than concurrent name resolution, for performance reasons.
Regards
Anders