**beep ** bop.

  • 0 Posts
  • 46 Comments
Joined 2 years ago
Cake day: July 1st, 2023

  • One thing about Grafana, though, is that you get logs, metrics and monitoring in the same package. You can use Loki as the actual log store and it’s easy to integrate it with the likes of journald and Docker.

    Yes, you will have to spend more time learning LogQL, but it can be very handy where you don’t have metrics (or don’t want to implement them) and still want some useful data from logs.

    After all, text logs are just very raw, unstructured events in time. You may think that you only look into them very occasionally when things break, and you would be correct. But if you want to alert on them, oftentimes that means you’re going from raw logs to structured data. Loki’s LogQL does that, and it’s still ten times easier to manage than the Elastic stack.

    VictoriaMetrics has its own logging product too, now, and while I haven’t tried it yet, VM for metrics is probably the best thing that has happened since Prometheus. Especially for resource-constrained homelabs.
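
To make the "raw logs to structured data" step concrete, here is a minimal Python sketch of the idea, not LogQL itself and not how Loki implements it. The log format (`<ISO timestamp> <LEVEL> <message>`), the sample lines and all function names are assumptions made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical log format: "<ISO timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def parse_line(line):
    """Turn one raw text line into a structured event, or None if it does not fit."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    try:
        ts = datetime.fromisoformat(m.group("ts"))
    except ValueError:
        # The first field looked like a token but was not a timestamp.
        return None
    return {"ts": ts, "level": m.group("level"), "msg": m.group("msg")}


def errors_per_minute(lines):
    """Count ERROR events per one-minute bucket."""
    counts = Counter()
    for line in lines:
        event = parse_line(line)
        if event and event["level"] == "ERROR":
            counts[event["ts"].replace(second=0, microsecond=0)] += 1
    return counts


def minutes_over_threshold(lines, threshold):
    """Return the minute buckets whose error count exceeds the threshold."""
    counts = errors_per_minute(lines)
    return sorted(minute for minute, n in counts.items() if n > threshold)


# Made-up sample data, including one line that cannot be parsed.
SAMPLE = [
    "2024-01-01T10:00:05 INFO service started",
    "2024-01-01T10:00:17 ERROR upstream timed out",
    "2024-01-01T10:00:42 ERROR upstream timed out",
    "2024-01-01T10:01:03 ERROR upstream timed out",
    "not a structured line at all",
]

if __name__ == "__main__":
    for minute in minutes_over_threshold(SAMPLE, threshold=1):
        print("alert:", minute.isoformat())
```

The alert condition only becomes expressible once the text has been turned into (timestamp, level) pairs, which is the same kind of work a log query language does for you at query time.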


  • Storage box networking can be hit and miss. It’s ok for incremental uploads, but I went through hell and back to get the initial backup to finish, which makes me wonder what it would take to download it in case I have to.

    scp breaks off once in a while, and WebDAV terminates the session. I didn’t try SMB as I feel it’s a rather weird protocol for the public internet. In the end, I figured it’s not the networking per se, it’s something with the timeouts on the remote, and I was able to finish the backup using a Hetzner-hosted server as a jumpbox.

    But it’s cheap, yeah.


  • Seq expects structured logs, which yours aren’t. So you want to either convert your app’s logs into a structured format (which is generally hard for a random third-party application) or use a log collector that’s fine with non-structured logs (e.g. Loki+Grafana don’t care about the shape of your logs and you can format the output while querying).
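
As a rough illustration of "format the output while querying", the sketch below keeps the raw lines untouched and only applies a shape to them at read time. It is plain Python, not Loki or Seq code; the `<name>` template syntax, the helper names and the sample lines are all assumptions for the example (field names must be valid Python identifiers).

```python
import re


def compile_pattern(pattern):
    """Translate a template such as '<level>: <msg>' into a compiled regex.

    Every <name> becomes a non-greedy named capture group; all other
    characters are matched literally.
    """
    # With one capture group, re.split alternates: literal, name, literal, ...
    parts = re.split(r"<(\w+)>", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)
        else:
            regex += f"(?P<{part}>.+?)"
    return re.compile("^" + regex + "$")


def query(raw_lines, pattern, template):
    """Parse stored raw lines at query time and render the ones that match."""
    rx = compile_pattern(pattern)
    rendered = []
    for line in raw_lines:
        m = rx.match(line)
        if m:
            rendered.append(template.format(**m.groupdict()))
    return rendered


# Made-up unstructured lines, stored exactly as the application wrote them.
RAW = [
    "WARN: disk usage at 91%",
    "INFO: backup finished",
    "free-form line without a level",
]

if __name__ == "__main__":
    for row in query(RAW, "<level>: <msg>", "[{level}] {msg}"):
        print(row)
```

The point of the design is that nothing about the shape is decided at ingestion, so a third-party application's log format never has to be converted up front.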


  • I have a dedicated VM for things that are crucial to the home network, either latency-critical or network-related.

    That’d be my DNS resolver (I enforce it over VLANs by hijacking anyone trying to do DNS to other resolvers, like random IoT devices), homebridge for less important home automation and my own Matter controller for the most important home automation (controlling the lights).

    My router of choice is RouterOS in another VM. I tried opnsense, pfsense, vyatta, and a bunch of others (even a containerized Cisco router), and I settled on ROS, because it was the only one that could do IPv6 properly (apart from Cisco, but that has other issues).

    The less important things I run on k8s and really, there are only two bits worth mentioning as essential: ArgoCD and nixhelm. Together, they provide effortless and mostly automated software updates with very easy rollbacks. I don’t have to go and manually update every single bit of software and that saves huge amounts of time.


  • That’s a good point. Mind that in most production environments you’d be firewalled rather hard (especially when it comes to log processing, which oftentimes ends up having PII). I wouldn’t trust any service in there that tries to use DoT or DoH that I couldn’t snoop on. Many deployments nowadays allow you to “punch” firewall holes based on the outgoing DNS requests to an allowlisted domain, so chances are you actually want to use the glibc resolver and not try to be fancy.

    That said, smaller images are always good in my book!


  • You’re nailing your goal then!

    I would still steer you slightly towards documenting your architectural decisions more. It’s a good skill to have and will help you in the long run.

    You have dozens of crate dependencies and only you know why they are in there. A high-level document on how your system interconnects and how the algorithms under the hood work will be a huge help to anyone who comes looking through your source code. We become better programmers not by reading the source code, but by understanding what it actually does.

    Here’s a random bit of trivia: your server depends on trust-dns-resolver. Why? Why wasn’t the stock resolver enough? Is that a design choice or did you just want to have fun? There is no wrong answer, but without the design notes it’s hard to figure out your intent.


  • This looks nice, but there are plenty of free alternatives in this space, which warrants a section in the readme with a comparison to other products.

    You mention RAM usage, but it’s oftentimes a product of event size. Based on your numbers, your average event size is about 800 bytes. Let’s call it 1 KB. That’s one million events per day. It surely sounds more promising than Elastic, but it doesn’t reach Loki numbers and, if you focus on efficiency, it is way behind VictoriaMetrics Logs (based on peeking at their benches).

    I think the important bits you need to add are how you store the logs (i.e. which indices you build) and what your trade-offs are. Grep is an efficient log processor which barely uses any RAM but incurs dramatic I/O costs, after all.

    Enterprises will be looking at different numbers and they have lots of SaaS products to choose from. Homelab users are absolutely your target audience and you can win them over by making a better UI than the alternative (VictoriaMetrics Logs isn’t that comfortable to work with) or making resource usage lower (people run k8s clusters on RPis, they sure wonder about every megabyte of RAM lost) or making the deployment easier (fire and forget, and when you come to it, it works).

    It sounds like lots of things and I don’t want to be discouraging. What you started there is really nice-looking. Good job!
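
The grep-versus-index trade-off mentioned in the comment above can be sketched in a few lines of Python. This is a toy model under stated assumptions (whitespace tokenization, lowercase search terms, everything in memory, made-up sample lines), not a description of how any of the named products store logs.

```python
from collections import defaultdict


def grep(lines, term):
    """Linear scan: no extra memory, but every query touches every line."""
    return [i for i, line in enumerate(lines) if term in line.lower().split()]


def build_index(lines):
    """Inverted index: token -> ascending list of line numbers.

    Paid for once at ingestion time, and it costs memory (or disk) to keep.
    """
    index = defaultdict(list)
    for i, line in enumerate(lines):
        for token in set(line.lower().split()):
            index[token].append(i)
    return index


def lookup(index, term):
    """Answer a query from the index without rereading the lines."""
    return index.get(term, [])


# Made-up sample lines.
LINES = [
    "upstream timeout after 30s",
    "backup finished",
    "Timeout talking to dns",
]

if __name__ == "__main__":
    index = build_index(LINES)
    print(grep(LINES, "timeout"), lookup(index, "timeout"))
```

Both functions return the same line numbers; what differs is where the cost lands, which is exactly the kind of trade-off a readme section on storage and indices would spell out.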