Listening on a socket doesn't drain any battery when no data arrives unless the app does other things that actually use CPU. That's just what Google/Apple want you to believe so you depend on their proprietary lock in services.
Also like, how else would the Google / Apple services do it? Probably via sockets right? I guess you could do it in a pull-based approach on a timer, but that doesn't seem more efficient to me.
A single process waiting on multiple sockets is basically no more expensive than a single socket, but if each app has its own background process then that is more expensive. So for best performance you really want to delegate all the push-notification-listening for all the apps on a device to a single background process owned by the OS, but it'd be fine for each app to use its own push server (though of course most apps do not actually want to self-host this).