Share Variable (Data From File) Among Multiple Python Scripts Without Loading Duplicates
Solution 1:
Between the mmap module and numpy.frombuffer, this is fairly easy:
import mmap
import numpy as np

with open("matrix_file.mtx", "rb") as matfile:
    mm = mmap.mmap(matfile.fileno(), 0, access=mmap.ACCESS_READ)
    # Optionally, on UNIX-like systems in Py3.3+, add:
    # os.posix_fadvise(matfile.fileno(), 0, len(mm), os.POSIX_FADV_WILLNEED)
    # to trigger background read-in of the file to the system cache,
    # minimizing page faults when you use it
    matrix = np.frombuffer(mm, np.uint8)
Each process would perform this work separately and get a read-only view of the same memory. You'd change the dtype to something other than uint8 as needed. Switching to ACCESS_WRITE would allow modifications to shared data, though it would require synchronization and possibly explicit calls to mm.flush to actually ensure the data was reflected in other processes.
A more complex solution that follows your initial design more closely might be to use multiprocessing.SyncManager to create a connectable shared "server" for data, allowing a single common store of data to be registered with the manager and returned to as many users as desired; creating an Array (based on ctypes types) with the correct type on the manager, then register-ing a function that returns the same shared Array to all callers, would work too (each caller would then convert the returned Array via numpy.frombuffer as before). It's much more involved (it would be easier to have a single Python process initialize an Array, then launch Processes that would share it automatically thanks to fork semantics), but it's the closest to the concept you describe.