android - Compact alternatives to Java ArrayList<String>
I need to store a large dictionary of natural language words, roughly 120,000 depending on the language. These need to be kept in memory, because profiling has shown that the algorithm that uses the array is the time bottleneck in the system. (It's a spellchecking/autocorrect algorithm, though the details don't matter.) On Android devices with 16 MB of memory, the memory overhead associated with Java Strings is causing us to run out of space. Note that each String has a 38-byte overhead associated with it, which adds up to roughly 5 MB of overhead.
At first sight, one option is to substitute char[] for String. (Or even byte[], since UTF-8 is more compact in this case.) But again, the memory overhead is an issue: each Java array has a 32-byte overhead.
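For scale, the per-object figures quoted above can be multiplied out. This is only a back-of-the-envelope sketch: the 38-byte and 32-byte values are taken from the question as stated, not measured, and the class and method names are made up for illustration.

```java
// Hypothetical helper: multiplies the per-object overheads quoted in the
// question by the number of words. The constants are the question's own
// figures, not measurements of any particular VM.
final class OverheadEstimate {
    static final int STRING_OVERHEAD_BYTES = 38; // per String, as quoted
    static final int ARRAY_OVERHEAD_BYTES = 32;  // per char[]/byte[], as quoted

    // Total overhead in bytes for wordCount separately allocated objects.
    static long totalOverhead(int wordCount, int perObjectBytes) {
        return (long) wordCount * perObjectBytes;
    }
}
```

With 120,000 words this gives 4,560,000 bytes for Strings and 3,840,000 bytes for one array per word, so switching to per-word arrays saves comparatively little.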
One alternative to ArrayList<String>, etc. is to create a class with much the same interface that internally concatenates all the strings into one gigantic string, e.g. represented as a single byte[], and then stores offsets into that huge string. Each offset would take up 4 bytes, giving a much more space-efficient solution.
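The idea just described might look roughly like the following. It is a minimal sketch rather than a tuned implementation: the class name PackedStringList is hypothetical, the list is read-only, and it assumes the whole dictionary fits in one byte[] (under 2 GB of UTF-8 data).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.AbstractList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical read-only list: every word is stored back to back in one
// UTF-8 byte[], and offsets[i] is the position where word i starts.
// offsets has one extra trailing entry marking the end of the last word.
final class PackedStringList extends AbstractList<String> implements RandomAccess {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    private final byte[] data;
    private final int[] offsets;

    PackedStringList(List<String> words) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        offsets = new int[words.size() + 1];
        int i = 0;
        for (String word : words) {
            offsets[i++] = out.size();
            byte[] encoded = word.getBytes(UTF8);
            out.write(encoded, 0, encoded.length);
        }
        offsets[i] = out.size();
        data = out.toByteArray();
    }

    // Decodes word number index on demand; a new String is created per call.
    @Override
    public String get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int start = offsets[index];
        return new String(data, start, offsets[index + 1] - start, UTF8);
    }

    @Override
    public int size() {
        return offsets.length - 1;
    }
}
```

For 120,000 words the offsets cost about 480,000 bytes in total, plus two array headers. The trade-off is that get() decodes a fresh String on every call, so a hot loop would probably want to compare against the bytes directly instead.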
My questions are: a) are there any other solutions to the problem with similarly low overheads*, and b) is any solution available off-the-shelf? Searching through the Guava, Trove and PCJ collection libraries yields nothing.
*I know one can get the overhead down below 4 bytes, but there are diminishing returns.
NB. The question "Support for compressed strings being dropped in HotSpot JVM?" suggests that the JVM option -XX:+UseCompressedStrings isn't going to help here.
I had to develop a word dictionary for a class project. I ended up using a trie data structure. I'm not sure of the size difference between an ArrayList and a trie, but the performance is a lot better.
Here are some resources that may be helpful.
https://en.wikipedia.org/wiki/trie
https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/
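To make the trie suggestion concrete, here is a minimal sketch of a trie that supports adding words and testing for words and prefixes. The class and method names are hypothetical, and this is not the answerer's actual code. Note that this naive layout, with one HashMap per node, is unlikely to save memory compared with an ArrayList<String>; a compact trie would need array-based or packed nodes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal trie: one node per character, children kept in a map.
final class WordTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if a stored word ends at this node
    }

    private final Node root = new Node();

    // Inserts a word, creating any missing nodes along its path.
    void add(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            Node next = node.children.get(c);
            if (next == null) {
                next = new Node();
                node.children.put(c, next);
            }
            node = next;
        }
        node.isWord = true;
    }

    // True only if exactly this word was added.
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    // True if at least one stored word starts with the given prefix.
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Walks the trie along s; returns null if the path does not exist.
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node;
    }
}
```

Lookups take time proportional to the length of the word rather than the size of the dictionary, which is presumably where the performance gain mentioned above comes from.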