Anyone using zram and similar memory management with 32gb of ram or more?

kiol@discuss.online · 2 days ago

Anyone using zram and similar memory management with 32gb of ram or more?

ISO@lemmy.zip · 2 days ago

It’s shit info. zram is actually better, more so in situations with a lot of RAM and high usage.

floquant@lemmy.dbzer0.com · 2 days ago

Anything you’d like to dispute specifically, or should we just take your “it’s shit” over a detailed explanation?

moonpiedumplings@programming.dev · 22 hours ago

In my testing, zram has much, much better compression than zswap.

The points about LRU inversion, cgroups, and so on are valid, but at the end of the day, I don’t really care. I was able to open as many Firefox tabs as I wanted with zram, but I could not do so with zswap, and that’s what matters to me.

The author of the blog post is a Facebook engineer. Millions of ultra-high-performance Linux servers are a very different use case from a single desktop. It’s perfectly reasonable for a solution that suits one to not be appropriate for the other.

Copied from my previous comment about this where ISO also gave a similar reply and was met with a similar response lmao.

Atemu@lemmy.ml · 9 hours ago

This testing compares apples to oranges: differently sized swap and quite obviously different workloads. Given how much compression ratios depend on the specific data being compressed, this experimental setup cannot produce valid results.

This is exacerbated by your swap being full. Zswap is more of a cache in front of your actual swap; it requires physical swap to function. If the physical swap is full, it cannot receive more data! Zswap not doing very much when the swap is full is totally expected behaviour because it simply can’t. The solution to that is to size your swap sensibly. (Admittedly, this does not appear to be documented clearly.)
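
To check whether the physical swap really was full during a test, one option is to look at `/proc/swaps`. Below is a minimal sketch that parses that file's usual five-column layout (Filename, Type, Size, Used, Priority, sizes in KiB). The sample text, the helper names, and the 95% threshold are made up for illustration; on a real system you would pass in the contents of `/proc/swaps` instead.

```python
# Sketch: parse /proc/swaps text and report how full each swap area is.
# SAMPLE_PROC_SWAPS and all helper names are hypothetical, for illustration only.

SAMPLE_PROC_SWAPS = (
    "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\n"
    "/dev/nvme0n1p3                          partition\t8388604\t\t8388604\t\t-2\n"
    "/swapfile                               file\t\t4194300\t\t1048575\t\t-3\n"
)


def parse_proc_swaps(text):
    """Return one dict per swap area; sizes are in KiB as reported by the kernel."""
    entries = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore malformed or empty lines
        size_kib, used_kib, priority = (int(f) for f in fields[-3:])
        entries.append({
            "name": fields[0],
            "type": fields[1],
            "size_kib": size_kib,
            "used_kib": used_kib,
            "priority": priority,
        })
    return entries


def fill_ratio(entry):
    """Fraction of a swap area that is in use (0.0 for a zero-sized area)."""
    return entry["used_kib"] / entry["size_kib"] if entry["size_kib"] else 0.0


def full_swap_areas(entries, threshold=0.95):
    """Names of swap areas at or above the (arbitrary) fill threshold."""
    return [e["name"] for e in entries if fill_ratio(e) >= threshold]


if __name__ == "__main__":
    for e in parse_proc_swaps(SAMPLE_PROC_SWAPS):
        print(f"{e['name']}: {fill_ratio(e):.0%} full, priority {e['priority']}")
```

If the backing swap shows up as full here, zswap has nowhere to put new pages, so any comparison made in that state says little about zswap itself.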

zswap uses the exact same allocator as zram these days (zsmalloc). It’d be very surprising if it had different space efficiency characteristics. It’s not impossible (could be a bug), but claiming so would require quite solid evidence IMHO.
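
For anyone who wants to compare the two on equal footing, here is a rough sketch of how the ratios could be computed. It assumes the zram `mm_stat` layout where the first three fields are `orig_data_size`, `compr_data_size` and `mem_used_total` (bytes), and the zswap debugfs counters `stored_pages` and `pool_total_size`; the function names, the sample numbers and the 4 KiB page size are assumptions for illustration. The numbers are only comparable if both runs compress the same data.

```python
# Sketch: compute compression ratios from zram and zswap counters.
# Function names and sample values are hypothetical.


def zram_ratios(mm_stat_text):
    """Ratios from the contents of /sys/block/zramN/mm_stat.

    Only the first three fields are used:
    orig_data_size, compr_data_size, mem_used_total (all in bytes).
    """
    fields = [int(x) for x in mm_stat_text.split()]
    orig_data_size, compr_data_size, mem_used_total = fields[:3]
    return {
        # Pure compressor efficiency.
        "data_ratio": orig_data_size / compr_data_size if compr_data_size else None,
        # Includes allocator overhead and fragmentation.
        "effective_ratio": orig_data_size / mem_used_total if mem_used_total else None,
    }


def zswap_ratio(stored_pages, pool_total_size, page_size=4096):
    """Effective ratio from zswap's stored_pages and pool_total_size counters."""
    if pool_total_size == 0:
        return None
    return stored_pages * page_size / pool_total_size


if __name__ == "__main__":
    sample_mm_stat = "4294967296 1073741824 1140850688 0 1140850688 1000 0 0"
    print(zram_ratios(sample_mm_stat))
    print(zswap_ratio(stored_pages=1048576, pool_total_size=1073741824))
```

The `effective_ratio` for zram and the zswap ratio both include allocator overhead, so those are the two figures to put side by side.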

RE: LRU inversion: the problem with not caring about it is that it’s not a visible problem until it very suddenly is. Your system will not gradually degrade but very suddenly and unpredictably hit a wall that it cannot get itself over.

ISO@lemmy.zip · edit-2 45 minutes ago

> LRU inversion: the problem with not caring about it is that it’s not a visible problem until it very suddenly is. Your system will not gradually degrade but very suddenly and unpredictably hit a wall that it cannot get itself over.

All this talk just confirms my feelings that there is a general lack of understanding of actual modern workloads.

RAM (with or without zram) doesn’t get full and then stay full forever in real workloads. Not only is that not realistic at the “opened apps”/“running processes” level, it’s not real at the heap allocation level within tasks within processes. And this is much more pronounced with code written in modern languages like Rust and some styles of C++. Modern heap allocators batch and cache (primarily to help with performance). But still, A LOT of memory is getting allocated and deallocated all the time, even from the kernel’s PoV.

LRU itself is an imperfect approximation, not a goal. In the setup described in my other comment (fast SSD swap storage only used sparingly most of the time), so-called LRU inversion gets auto-cancelled relatively quickly, as free space in RAM (+zram) becomes available all the time, and some “LRU-hot” pages in SSD swap turn out to be actually cold, and those are the only ones that actually stay there.

This is why I imagine that a lot of fake scenarios, and “benchmarks” based on them, may fail to replicate the practical reality of many (overall system) use-cases.

More tangentially, the oversized concern for file caching pages also points to specific use-cases being kept in mind, as if everyone is running DB-centric workloads or something.

floquant@lemmy.dbzer0.com · 19 hours ago

It’s not the opinion itself, it’s just the attitude. Your comment is a perfect example of what I consider a good reply, as you brought both hard data and some nuance in expressing how you formed your opinion.

Gamma@programming.dev · 2 days ago

Shit info from a kernel dev who works on the memory management subsystem?

ISO@lemmy.zip · 24 hours ago

Alright, I will only reply to you, since you raised a fair question.

First of all, I must admit that I thought what was linked was an earlier, similar piece, but the general theme is still the same.

The problem with the writing is that it focuses on use-cases like Android and some servers, but doesn’t take into account other use-cases. It also seems to come with the assumption that setup is done by the distributor only, or if it’s done by the user, it’s a configure-and-forget situation.

What he describes is:

- Limited RAM space
- Swap will always/often happen (outside of (z)ram)
- Single tier of non-RAM swap
- Non-RAM swap is significantly slower
- OOM can be preferable over (outside of RAM) swapping
- Swapped out pages stay where they are until they are required by their process (important).

Now let’s look at a possible modern workstation setup:

- Large RAM size
- Swap is rarely hit, especially if set up with zram.
- Multiple swap tiers beyond zram/zswap
  - Intel Optane disk used as a super-fast zram write-back device, or a high-priority swap
  - Fast NVMe disk used as a second tier swap disk
  - Large HDD swap partition used as a third tier swap disk
- The biggest consideration is avoiding worst case latency, i.e. hitting HDD swap.
- Killing processes MUST be avoided, unless exceptional circumstances are hit where the kernel’s OOM would kick in anyway. This holds true even when HDD swap starts getting used.
- When unusual loads are observed, swapped pages can be moved around by the user (or a tool), by turning swap devices off and on. This is how you can empty the HDD swap partition for example.

This last point in particular should make it clear why his “imagination” was rather limited in his LRU inversion section.

Atemu@lemmy.ml · 8 hours ago

> Swap is rarely hit, especially if set up with zram.

This is not a good thing btw. Any unused anonymous page takes up space that could instead be used for file-backed pages that make your system faster.

> Multiple swap tiers beyond zram/zswap

Swap is not tiered storage!

Priorities control order of preference, not tiers. If you run out of space on a higher priority, it will not move that swap’s data to a lower priority swap. It will keep all of it exactly where it is and new data will hit the lower prio swap instead, no matter how hot it is.
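
A toy model of that behaviour, in case it helps. This is not how the kernel is implemented, just a sketch of the placement rule described above: new pages go to the highest-priority device that still has room, and nothing already written is ever migrated. The device names, capacities and page labels are all made up, and round-robin between equal-priority devices is not modelled.

```python
# Toy model of swap priorities: order of preference for NEW pages only,
# with no migration of pages that were already swapped out.
from dataclasses import dataclass, field


@dataclass
class SwapDevice:
    name: str
    priority: int
    capacity: int  # in pages
    pages: list = field(default_factory=list)


def swap_out(devices, page):
    """Place a page on the highest-priority device with free space."""
    for dev in sorted(devices, key=lambda d: d.priority, reverse=True):
        if len(dev.pages) < dev.capacity:
            dev.pages.append(page)
            return dev.name
    raise MemoryError("all swap devices are full")


def simulate():
    devices = [
        SwapDevice("zram0", priority=100, capacity=2),
        SwapDevice("nvme", priority=10, capacity=2),
        SwapDevice("hdd", priority=1, capacity=4),
    ]
    placement = {}
    # Cold pages get swapped out first and fill the fast devices;
    # the page that is evicted last lands on the slowest device.
    for page in ["cold-1", "cold-2", "warm-1", "warm-2", "hot-1"]:
        placement[page] = swap_out(devices, page)
    return placement


if __name__ == "__main__":
    for page, dev in simulate().items():
        print(f"{page} -> {dev}")
```

In this model the cold pages keep occupying `zram0` for as long as they stay swapped out, while the most recently evicted page ends up on `hdd`. That is the opposite of what a tiered setup would give you.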

> Intel Optane

Cool tech but it’s dead and was quite niche even when it was alive.

> zram write-back device

Not a thing you actually want to use for swap. It’s not an automatic writeback that is integrated into the Linux MM in any way. (Probably has some use-case for non-swap zram purposes though.)

> Large HDD swap partition used as a third tier swap disk

This makes no sense at all unless you are extremely space-constrained on the NVMe and absolutely must not OOM – even if progress stalls to an absolute crawl.

> swapped pages can be moved around by the user (or a tool), by turning swap devices off and on.

This is neither feasible nor desirable. You don’t have enough granularity to do anything useful by doing so.
Even if you had, it’d work against the MM because it resurrects pages as “hot” that have been cold for a long time.
In any situation where swap is important, making the kernel think cold pages are hot is the very last thing you want.

I too wish it were, but tiered/transcendent memory is not a thing in Linux and these hacks do not change that fact; they merely look similar if you don’t look closely enough.

I cannot think of a single use-case where this would be preferable to a decently sized physical swap with zswap XOR just zram swap (if physical swap is infeasible).

ISO@lemmy.zip · 2 hours ago

> This is not a good thing btw. Any unused anonymous page takes up space that could instead be used for file-backed pages that make your system faster.

Can you expand on this? I think my attempt at brevity in this part wasn’t helpful.

> Swap is not tiered storage!

I meant tiered with priorities only, yes.

> Cool tech but it’s dead and was quite niche even when it was alive.

We are not talking about the original purpose of Optane as supported on Windows. It’s just a (perhaps somewhat outdated) example of a storage device “smaller but faster than your average SSD storage”, which is very much not dead tech.

> Not a thing you actually want to use for swap

Depends on the use-case. But yes, this can also be used as the fastest disk tier/priority of normal swap devices, which is why I mentioned both.

> This makes no sense at all unless you are extremely space-constrained on the NVMe and absolutely must not OOM – even if progress stalls to an absolute crawl.

Why would you want to see killed processes when you go back to your workstation, in the 1/10000th scenario where something runs amok and pushes memory usage to unexpectedly high levels, when you can simply investigate the reason behind the rare occurrence, then move all the pages off the slowest devices immediately with swapoff?

dreugeworst@lemmy.ml · 14 hours ago

Intel Optane? Is there even any advantage left for Optane compared with a fast, modern NVMe disk?

ISO@lemmy.zip · 13 hours ago

It was just an example of a “smaller+faster than your average SSD”.

(And I was mentioning something similar to my setup instead of an imaginary use-case.)
