What is with you kids and DBs?

Discussion in 'Plugin Development' started by hash, Feb 8, 2011.

Thread Status:
Not open for further replies.
  1. Offline

    Plague

    I think there is no need to advocate sqlite, this thread isn't against DB usage, but against using sqlite for storing ten configuration options, etc.
     
  2. Offline

    Byteflux

    Sorry, was just making sure he/she understood that there are still valid circumstances even in a bukkit plugin in which you could choose to use something like it.
     
  3. Offline

    hash

    I'd still advocate JSON or YAML or XML or what-have-you over sqlite unless you've crossed into that gigabyte range with your data when the scalability matters. You don't need sqlite unless you need concurrent access control in a way that flock'ing won't provide... and that's not an easy claim to come up with either, when it comes right down to it. Flock supports "shared" aka "read" locking as well as "exclusive" aka "write" locking, and that's really all any system, databases included, can ever provide.
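
    For anyone who wants to see what shared versus exclusive locking looks like from a plugin's side, here is a rough Java sketch. Java has no call named flock, but FileChannel.lock in java.nio offers the same shared/exclusive distinction. The class name LockedFlatFile and both method names are made up for illustration, and this is a sketch under assumptions rather than a finished implementation. Note that Java's file locks coordinate separate processes, not threads inside a single JVM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper (not a Bukkit class): guards one flat file with OS-level locks.
class LockedFlatFile {

    // Reads the whole file while holding a shared ("read") lock.
    static String readShared(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
             FileLock lock = ch.lock(0, Long.MAX_VALUE, true)) { // true = shared
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or the file ends
            }
            return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Replaces the file contents while holding an exclusive ("write") lock.
    static void writeExclusive(Path file, String contents) {
        try (FileChannel ch = FileChannel.open(file,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = ch.lock()) { // no arguments = exclusive, whole file
            ch.truncate(0); // truncate only after the lock is held
            ByteBuffer buf = ByteBuffer.wrap(contents.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    A plugin would call readShared when loading its file and writeExclusive when saving it. Several readers can hold the shared lock at once, while a writer waits until it has the file to itself.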

    So, as long as you say "you may choose sqlite as an option for coordinating multiple applications" and not "you must", then I'd say that's fair.
     
  4. Offline

    petteyg359

    I clicked the wrong line to copy, sorry :) Copied some lines, ran to dinner before the cafeteria closed, came back to finish the post and didn't double-check that I had the right plugins. But I said you might be okay using it, so please forgive me :)
     
  5. Offline

    croxis

    There is also the wide spectrum of machines to consider. On one end of the axis are dedicated machines that run minecraft and only minecraft with 20 gigs of ram. They obviously have no need to be running sql for a webserver; on the other hand, they do have a lot of ram to spare. On another axis you have the casual server host who, and this is just a hunch, is hosting on windows, where setting up mysql is another chore. On yet another axis are the VPSers, from the high end again to the low end like me, who did everything possible on my industry-minimum 512 MB ram shard to free up as much memory as possible.

    Another factor to consider is that some of these machines are also webservers that are running some kind of sql database anyways. My 512 fell into this subset.

    The crux of my argument is that as long as the hardware is available for use it does not hurt to use existing tools for the job. However that is not the universal case such as the casual host or the constrained VPS.

    I believe that SQL is underutilized by plugin developers. I envisioned bukkit using one database for the whole server. Not per plugin, but for the server. Some common elements would be stored in their respective tables, such as a player table. Currently, if I wanted to make my Civilization plugin support economy, I would have to hunt down each econ plugin I wanted to support, then write and maintain an interface to each of them. However, if the economy plugins just added and used a money field, I could support any economy plugin that used that standard.

    Granted, plugin makers can agree on a json/yaml/xml/whatever persistent storage instead, however, due to human nature, the only way I can see that happening is if that interface was part of bukkit. Now it comes down to finite developer time. Either time is spent to implement a new system and something else is delayed (and also risk a new entry point for bugs), or use an existing system. My vote is use an existing, well established and documented system.
     
  6. Offline

    hash

    You're trying really hard to look at a banana like it's a nail. Databases are not a substitute for a good API; they're a substitute for when you're too lazy to make any API at all. This is really the same thing brought up by Toasty earlier.

    Using a database doesn't automatically help you out of the problem of having to actually THINK about every economy plugin you might want to integrate with. It's the exact same work for you to all agree on one YAML or JSON based scheme as it is for you to all agree on one database name and table schema and key name. Exact. Same. Why do you think that the "human nature" problems you alluded to would somehow be less present with a database than any other scheme of doing the same thing?

    When you want systems to interoperate like that, the correct way to go about it is to involve an API. If you use databases as a crutch because no one designed a proper API, you lose out on the ability to do all sorts of things. Suppose sometime in the near future somebody comes out with a plugin that wants to monitor to make sure an account in your economy system doesn't go below a certain threshold, or if it does it takes some action. How do you do that with the database? Constantly poll the database to see what the value is every millisecond? You can do that, I suppose, but it would be ridiculously inefficient -- this is REAL inefficiency we're talking about here, the 100% CPU spin-and-drag-everything-to-a-halt kind, not the imagined oh-my-god-it-takes-so-long-to-write-1000-lines-to-disk kind -- and even then your application won't always work correctly, because someone could always drop below the threshold and then pop back above again and those two actions might take place between your polling no matter how fast you loop it. If you do it with an API, you just have to improve the code base so that you can have listeners for doCashTransaction(int).
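
    To make the listener idea concrete, here is a minimal Java sketch of what that might look like. Every name in it (TransactionListener, EconomyApi, ThresholdWatcher) is hypothetical and not taken from Bukkit or any real economy plugin, and doCashTransaction is given an extra account argument purely so the example can track more than one balance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback: invoked once for every balance change.
interface TransactionListener {
    void onTransaction(String account, int oldBalance, int newBalance);
}

// Hypothetical economy API: all balance changes go through one method.
class EconomyApi {
    private final Map<String, Integer> balances = new HashMap<>();
    private final List<TransactionListener> listeners = new ArrayList<>();

    void addListener(TransactionListener listener) {
        listeners.add(listener);
    }

    int getBalance(String account) {
        return balances.getOrDefault(account, 0);
    }

    // Applies the change, then tells every listener about it immediately.
    void doCashTransaction(String account, int amount) {
        int oldBalance = getBalance(account);
        int newBalance = oldBalance + amount;
        balances.put(account, newBalance);
        for (TransactionListener listener : listeners) {
            listener.onTransaction(account, oldBalance, newBalance);
        }
    }
}

// Records an alert each time an account crosses below the threshold.
class ThresholdWatcher implements TransactionListener {
    private final int threshold;
    final List<String> alerts = new ArrayList<>();

    ThresholdWatcher(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void onTransaction(String account, int oldBalance, int newBalance) {
        if (oldBalance >= threshold && newBalance < threshold) {
            alerts.add(account);
        }
    }
}
```

    Because the watcher is called on every transaction, a balance that dips below the threshold and pops straight back up is still noticed, which is exactly the case a polling loop can miss.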

    (...which, you'll notice, is how the core of bukkit actually works: event listeners. Can you imagine if everyone developing plugins for bukkit had to poll a database for events constantly and then process them from there? And then bukkit had to poll it again to see who modified what events? *shudder* And then it was basically random in what order which plugins would notice and modify events, and might not notice them at all before they were modified out of recognition? And every single plugin as well as the core itself was doing 100% CPU spinning the entire time just to keep polling?)



    I don't see how it's relevant at all that some machines are already running SQL servers. About half of the machines under my administration are, yes. They have hammers, so to speak. So? That makes minecraft look like something you can swing a blunt implement at, but it doesn't make it an honest-to-god nail by any means. I actually -still- don't want minecraft involved with the SQL servers on my machines that already have them unless there's a good reason.

    The fact that machines already have a DBMS installed doesn't change the fact that developers should really be able to think critically about what is and is not an appropriate tool for a job. Some of the responses to this thread so far clearly indicate that people simply have no idea how big or small the overhead in filesystems is in the real world, and no clue what kind of work load databases were actually designed to target, and that's concerning regardless of how widespread legitimate (or illegitimate) uses of DBMS might be.
     
  7. Offline

    Toasty

    True, though some of the most successful servers out there do, or have implemented a system like this in the past. World of Minecraft is a prime example.

    Not saying it's common on a per-server basis, but when it comes to registering and changing the membership of a large group of users, the benefits of a database become quite apparent. Even if all the data could fit in RAM.

    The support for databases is an important feature to include if there is any hope of sanely managing a large user base (in my opinion).

    Though I agree, most people running a minecraft server will not need a database.

    Again, correct me if I'm wrong, but most forums support adding custom information fields. phpBB has this feature, for example, and it's a very popular forum software package. Simply include an info field called "Minecraft Username" for people to enter their in-game name into. Another piece of software can easily be made that parses that field for information on the person's username.

    In phpBB, you can even force that field as a requirement upon registering. Recalculations can be done on a daily/hourly/minutely basis to cover people who may have mis-typed the name, or left it blank since they haven't yet bought the game.


    However, it's equally plausible to simply create a program that parses that field in the database and writes it to a YAML file. Though that adds another link in the chain, it's not necessarily a deal-breaker. Something to think about I guess.
     
  8. Offline

    hash

    Okay. If that's the only way to interface with phpBB, then do it. You have no choice, apparently, if phpBB made no other API (but being good programmers, I kinda bet they did and there's a php script of some kind you can call if you want to, too).

    Just don't mistake that for an argument that databases are a substitute for real APIs, and don't mistake the design situation behind phpBB as a design situation that applies to everything in the world. I've already enumerated the reasons why phpBB is a radically different case than bukkit plugin development and I'd like to not repeat myself.

    (Incidentally, I don't think feverdream was trying to say that your case is invalid. I suspect he was trying to say that in as much as we are opposed to the broad inappropriate and unthinking use of databases, your case probably isn't among the ones that are really suspect.)

    Okay, some forums support adding custom information fields. You're not wrong. It would be a real hilarious feat if someone someday produced a database where you -couldn't- add any fields. But what's your point? Some YAML schemes, I hear that they support arbitrarily named keys too! Woo-hoo. Was that somehow supposed to argue against any of the things I said about databases not being a magic wand for compatibility?

    I don't care how many custom fields you add, it doesn't change the fact that a database simply cannot enforce all conceivable rules about a set of data, and it doesn't invalidate anything I was trying to express in that example about special ranges of restrictions: unless you really, truly don't need any kind of validation outside of the type, then you still have the exact same need to validate data regardless of whether you're storing key-value pairs in a relational database or if you're representing them in YAML. And how often is your data -really- completely free of some sort of special restrictions? Not very damn often. Even with the seemingly basic example of usernames... do you allow linebreaks in the names? What about null-bytes or other unprintable characters? Are you encoding strings as UTF-8? Is the other application doing the same, or have you just not noticed the incompatibility yet because no one has submitted data outside of ASCII yet? The VARCHAR type in a database schema is going to allow a lot of things that may not be semantically appropriate for your application, and thus your database schema is not going to magically save you from invalid data if several programs are trying to coordinate using it. You still have to face all of the exact same hurdles with interoperability that you would using any other format whether it be JSON or XML or YAML or whatever.
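
    As a rough illustration of the kind of checks a column type won't do for you, here is a small Java sketch. The class UsernameRules and the 16-character limit are assumptions made up for the example, not rules taken from Minecraft, phpBB, or any plugin.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical validation rules that a VARCHAR column alone would not enforce.
class UsernameRules {
    // Assumed limit, chosen only for illustration.
    static final int MAX_LENGTH = 16;

    // Rejects empty or overlong names and any control character,
    // which covers line breaks, null bytes, and other unprintables.
    static boolean isValid(String name) {
        if (name == null || name.isEmpty() || name.length() > MAX_LENGTH) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            if (Character.isISOControl(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Returns true only if the raw bytes decode cleanly as UTF-8.
    // A fresh decoder reports malformed input instead of replacing it.
    static boolean isWellFormedUtf8(byte[] raw) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

    The same two checks are needed whether the name arrives from a database row, a YAML key, or a JSON string, which is the point being made above.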

    If you're saying that you think that phpBB can enforce some kind of arbitrary correctness constraints on the data field, then that absolutely proves that you're missing something here. When any other application is allowed to modify the database, then no, phpBB can't enforce a damn thing (beyond the very limited provisions of the database schema itself). Because another application is allowed to modify the database. Do you see the tautology at work here? I don't care how often you set something up to poll the database and re-check correctness; read what I said above about polling. If you try to do that, first of all it's obscenely wasteful; and second of all, it simply can not be guaranteed to ever actually work!

    A single API (which can in turn be backed by a database, if appropriate... it can be configurable for that matter; that's the joy of proper abstraction) CAN implement any conceivable rule about a set of data, as well as provide features like creating notifications and events based on changes in data. This is what a seasoned programmer will do if given the option.
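
    A minimal sketch of that abstraction in Java might look like the following. All names here (BalanceStore, InMemoryStore, AccountService) are hypothetical, and the "no negative balances" rule is just a stand-in for whatever rules a real plugin would need.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage backend: could be a flat file, a database, or memory.
interface BalanceStore {
    Integer load(String account); // null when the account is unknown
    void save(String account, int balance);
}

// Simplest possible backend, used here so the sketch runs on its own.
class InMemoryStore implements BalanceStore {
    private final Map<String, Integer> data = new HashMap<>();

    @Override
    public Integer load(String account) {
        return data.get(account);
    }

    @Override
    public void save(String account, int balance) {
        data.put(account, balance);
    }
}

// The single entry point: rules live here, not in the storage layer.
class AccountService {
    private final BalanceStore store;

    AccountService(BalanceStore store) {
        this.store = store;
    }

    int balanceOf(String account) {
        Integer stored = store.load(account);
        return stored == null ? 0 : stored;
    }

    // Rule: deposits must be positive.
    boolean deposit(String account, int amount) {
        if (amount <= 0) {
            return false;
        }
        store.save(account, balanceOf(account) + amount);
        return true;
    }

    // Rule: a withdrawal may never push the balance below zero.
    boolean withdraw(String account, int amount) {
        int current = balanceOf(account);
        if (amount <= 0 || amount > current) {
            return false;
        }
        store.save(account, current - amount);
        return true;
    }
}
```

    Swapping InMemoryStore for a YAML-backed or SQL-backed implementation would not change the rules, because callers only ever talk to AccountService.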



    I have to admit something: I have a sneaking suspicion that "because phpBB uses it!" is one of the number one reasons developers without a lot of experience immediately assume that it makes sense to use it in every situation. That's why I'm quick to point out the significant differences between the situations -- I can't believe it doesn't occur to people that PHP and java are different beasts!
     
  9. Offline

    croxis

    What is the empirical data to support that sql databases are inferior to json/yaml/xml flatfile databases?
     
  10. Offline

    Nodren

    Databases have a purpose. On a recent plugin I've been developing for managing clans and PvP stats related to clans, I opted to store clan configuration (such as hall location, name, users, etc.) in yml. This means editing a yml file or typing a /command in game updates the configuration.

    I then opted to use sqlite for stat tracking, because it makes no sense to record linear data like who killed whom, when, and what clan each of them was in at the time of the killing in a configuration file. I think my combination of the two is a perfect world. Server admins can reconfigure a clan by editing a file and then typing /reload. Others can download the sqlite db, and view it with any program, or write a web app to parse the stats in a way I haven't built into my plugin.
     
  11. Offline

    Nathan C

    I love databases.

    Easy to setup, easy to backup, because instead of having to backup 1000 + player accounts, I only have to backup one small MySQL database.

    Easy, unless of course you are running a Windows server, in which case.........
     
  12. Offline

    MatCat

    Just to throw my 2 cents into this. For configuration options a DB is really not the best thing unless it is a large, complex piece of software with a huge amount of config data. However, on the plus side, using MySQL makes it very easy to interface with a website. Why have flatfiles that I have to connect to my host and edit, when I can have an admin interface and DB stored data for easy access? Yeah, this is silly for 2 lines of ascii data, but something like Permissions would be perfect for a DB. Sure, the config file isn't exactly large, but a DB would make working with it much easier, as someone could just write a PHP backend and easily modify settings. Personally, in all of the plugins I am going to release I will support DB stored configuration data solely so I can have a web based interface that makes it super easy to have a frontend management system for it.
     
  13. Offline

    feverdream

    That is exactly what I meant; I am sorry if I was unclear.
     
  14. Offline

    Plague

    Right, why make things fast and easily readable when you can make bloatware and stuff just because you are lazy.
     
  15. Offline

    DerpinLlama

    Despite the valid arguments in this thread, I'm lazy. DBs do the work for me. I only tend to store configuration in flat file.
     
  16. Offline

    MatCat

    ROFL If I was lazy I would not be taking the time out to develop bloatware. I don't know about you but it is a pain constantly typing thousands of commands in minecraft chat, when a simple php script can give me a graphical interface that lets me do everything without typing a word, leaving me time to do other things that are more pressing.
     
  17. Offline

    Greasy Digits

    It's your fault that you're storing the player accounts in more than one flatfile. Regardless, creating a tarball of a directory or file is the exact same amount of work as mysqldump.

    Thanks for making this thread, hash. You beat me to it. I flat out refuse to install any plugins on my server that use SQLite. In 99.9% of the cases, it's unnecessary, and an indication of a lazy and/or inexperienced coder, neither of which I want to download plugins from. As for the "RAM is cheap" argument: for VPS users already running at the ragged edge of their memory allowance after allocating a massive pile to Java, a RAM upgrade is undeniably expensive. Decreasing the Java heap is usually not an option since it comes at the cost of server performance.

    What? You're using flatfiles wrong. Why can't your web interface load a JSON store from the server, queue up some changes, and merge them with the file on the server? Permissions wouldn't benefit whatsoever from moving to a RDBMS because the dataset is necessarily small and it's simple. Even with thousands of unique users and hundreds of plugins, it still makes no sense. I think the mistake you're making is that you're making the assumption that if data is stored in an RDBMS, it's somehow easier to manipulate client-side, and that's just not true.

    Best thing about this thread is that you can see who the crap coders are and know which plugins to avoid. ..and for Christ's sake, don't use XML as a data interchange format. Use JSON or YAML.
     
  18. Offline

    MatCat

    I think you got me wrong on that... I use flatfiles for config, but for any other data I am using a DB plain and simple.
     
  19. Offline

    Samkio

    @Samkio (PreviousPost).
    I couldn't resist :p

    As for the post.
    NO ONE CARES.
    My plugins use Flat/MySql/Sqlite.
    So the user can CHOOSE what they want.

    How dare you say they are "crap coders". Believe it or not the plugin developers are paid for what they do. They take time and effort to create plugins and if you can't be bothered to open up your ftp and edit a flatfile then why should they put the effort into coding sqlite etc?
    SO STFU :D and go learn java. :)
     
  20. Offline

    Afforess

    I feel bound to inform you that this statement is actually a logical fallacy.
     
  21. Offline

    hash

    I have a number of issues with this.

    First of all, you appear to be of the belief that everyone who runs a minecraft server really enjoys setting up more interfaces and dealing with webservers (and probably other interpreted languages that have to be installed and set up at that point as well, I'd imagine). That's an interesting belief. I don't share it. That may be fine if your one plugin makes the whole world go 'round and no one ever needs anything else, but that tends not to be the case. And when you multiply the work of all that extra setup by every plugin whose author believes that all this extra setup is somehow easier than editing plain text with whatever tool you want... well, I think that can get weighty.

    Personally, sure, I have apache configured and secured and I have PHP set up, and if someone wants to write a python wrapper for some reason, I know how to get apache to load modules for whatever like that too. But I'm not exactly your average user, and even though I have years of experience with all these things and my webserver already includes vast swaths of my machine via symlinks... I still hate it when I have to add another damn service just to be able to do some tiny little bit of fiddling to something.

    Second of all, I'm going to put on my security hat. When you say "connect to my host", I'm assuming you mean ssh.

    SSH is COOL.

    If you make a web interface as the only way to edit your database'd data, people have to set up a web server, set up the interpreters for your scripts, set up the database connections... and they have to do it all securely. Port forwarding on the firewall, making sure the admin script can't overwrite other parts of the webserver if the script wasn't designed carefully, making sure the admin script can only contact the right databases, making sure the admin script can only be logged onto by admins...

    And what, are you going to offer this admin panel over HTTP? Heh.

    See, while I'm wearing my security hat, I feel obliged to point out that anyone who does authentication over HTTP is... stupid. There's really no pussy-footing around here. Http sends your data in the clear. Including passwords. It is _not_ hard to sniff somebody's credentials from an HTTP connection. I do it just because I'm bored and on a public network sometimes -- fire up wireshark and just watch people's private data and login information fly by. It's seriously child's play to compromise anything that does authentication over HTTP.

    So, hopefully you're going to make sure this admin panel that you so desperately need for your database will only be accessible over HTTPS, and you're going to design your own full login system for this and make sure it's secure, right? Okay, so how many people buy signed certificates for their servers and have dedicated IPs? How many people have the technical skill to do all that stuff safely and securely?

    Oh, and I'm sure all of the people who use plugins like this will love having to keep all of the login systems straight, since everyone who makes a plugin with a web interface will now have to be re-inventing that wheel.

    Or you could just use SSH, and edit plain text. SSH, which is already available on any server. SSH, which encrypts your passwords and all of your session. SSH, which uses existing systems tested by thousands of people smarter than you over the course of decades to keep your passwords safe. SSH, which integrates with the single log on system of the entire machine. SSH, which lets you run any editing program of any kind you could ever need. SSH, which lets you forward entire graphical applications if you want to.

    Soooo.... explain to me here exactly how it is that you see web interfaces as so much easier than just using a storage format that people can interact with directly?



    I completely agree. If you're dealing with a system that has thousands of player records and keeps each one in its own file, then yes, you'll definitely see an improvement in performance if you moved to a database. But you'd also see a definite improvement if you just put all of the player data in one file. At the end of the day, that's exactly what the database is doing for you -- putting it all in one file so you can read the data with one disk seek instead of a thousand disk seeks. The database isn't doing anything magical here.
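
    Here is a small Java sketch of the "all player data in one file" idea. The PlayerFile class and the tab-separated name/balance format are assumptions invented for the example. A real plugin would more likely use YAML or JSON, but the JDK has no built-in parser for either, so plain lines are used here to keep the sketch self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical store: every player lives in one file, one line per player.
class PlayerFile {

    // Writes all players in a single call; each line is "name<TAB>balance".
    static void saveAll(Path file, Map<String, Integer> players) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : players.entrySet()) {
            lines.add(entry.getKey() + "\t" + entry.getValue());
        }
        try {
            Files.write(file, lines, StandardCharsets.UTF_8);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Reads every player back with one pass over one file.
    static Map<String, Integer> loadAll(Path file) {
        Map<String, Integer> players = new LinkedHashMap<>();
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // skip lines that do not match the format
                }
                String name = line.substring(0, tab);
                int balance = Integer.parseInt(line.substring(tab + 1));
                players.put(name, balance);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return players;
    }
}
```

    Loading a thousand players this way opens one file instead of a thousand, which is the improvement being described, with no database involved.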

    That is also indeed exactly what tarballs were designed for.


    Spot on. PHP, for example, has functions called json_encode and json_decode. Guess what they do? Transform php arrays directly into JSON and back again. Easy as pie. Combine this with the flock call for concurrency control, and you win.
     
  22. Offline

    Greasy Digits

    I think you parsed that statement incorrectly, but it was a little ambiguous, sorry. I wasn't saying that posting here implies that you are a crap coder. A more unambiguous way to state it would be:
    Thanks for the trip down memory lane to my 10000 level philosophy class, though!

    I didn't get you wrong on anything. You stated that one of the advantages of MySQL over flat files is that it interfaces easily with a website to remote administer your data, whereas with a flat file, you have to connect to a server and edit them directly. Both of these statements are moot, as you can just as easily connect to a server and use the mysql command prompt to make data changes, as well as remotely administer flat file data through a website. It's all in how you choose to manage your data.

    I have no problem connecting to my server to change the configuration; I'm not sure what makes you think I do. Paid or not, developers that blatantly ignore best practice are, by definition, crap. It's good that you give users the option to use MySQL/SQLite in case they feel like unnecessarily chewing up system resources, or have some bizarre edge case that makes a database an appropriate choice. There's a big difference between merely "learning java" and "writing solid java code". Neither is difficult.

    Now for a hasty generalization: Java programmers are thin-skinned and quick to white knight what are undeniably bad coding practices. Someone is bound to completely miss the joke here.
     
  23. Offline

    Plague

    Well I did, what's the joke in this observation of reality? :p
     
  24. Offline

    hash

    I have a little bit of data to offer up to help put this entire discussion in a real, numerical context when it comes to how many fractions of a second somebody might be saving when trying to use a database on small data.

    Here's a couple of quick shell scripts you can do-it-yourself at home if you can run bash:

    # make a file that's exactly 10MB in size
    yes | head -c $((10 * 1024 * 1024)) > disk1/junkfile
    # copy it and have the system time how long it takes
    time cp disk1/junkfile disk2/junkfile

    I just ran this on the computer I'm sitting at right now (both of the disks are ext4 filesystems and are 7200rpm spinners connected by SATA blah blah blah, very standard), and here's what I got: 0m0.011s.

    I want everyone who's ever thought a database is essential to performance for anything less than 10MB to stop and read this. 10MB takes about one hundredth of a second to move onto or off of a disk. 10MB, if we assume a config file has on average about 40 characters per line, is 262144 lines of data.

    More than two hundred thousand lines of data. In about one hundredth of a second.

    (100MB (aka about 2.6 million lines of config) gave me 0m0.689s, if anyone's curious. A gig got a little weighty: 0m18.095s. But then, that's getting up into the load range that databases were actually designed for.)
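
    For anyone who would rather repeat the measurement from Java than from bash, here is a rough equivalent, along with the lines-per-file arithmetic used above (10 MB is 10 * 1024 * 1024 = 10485760 bytes, and 10485760 / 40 = 262144 lines). The class name CopyTiming is made up, and the temp files stand in for the two disks, so the numbers it prints will not match the ones quoted. Timings this small also tend to include the operating system's file cache, so treat them as a ballpark rather than raw disk speed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical benchmark helper mirroring the shell commands above.
class CopyTiming {

    // How many lines fit in a file of the given size at a fixed line length.
    static long linesFor(long bytes, int bytesPerLine) {
        return bytes / bytesPerLine;
    }

    // Copies src over dst and returns the elapsed time in nanoseconds.
    static long timeCopyNanos(Path src, Path dst) {
        long start = System.nanoTime();
        try {
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("junkfile", ".bin");
        Path dst = Files.createTempFile("junkfile-copy", ".bin");
        Files.write(src, new byte[10 * 1024 * 1024]); // exactly 10MB of zeros
        long nanos = timeCopyNanos(src, dst);
        long size = Files.size(dst);
        System.out.printf("copied %d bytes (~%d lines of 40 bytes) in %.3f ms%n",
                size, linesFor(size, 40), nanos / 1e6);
    }
}
```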
     
  25. Offline

    Greasy Digits

    Just that if you narrow your view of Java coders to the content of this thread, it wouldn't be too hard to arrive at that conclusion.

    Good god can you imagine having to learn how to use every developer's unique and poorly designed interface for their data, security issues aside? I want to learn a unique interface for each of my server's plugins like I want an asshole on my forehead. If there's anything I know about programmers, it's that most of us blow at any sort of interface design anyway. I blow at interface design, but can identify a great interface. For the lucky ones that can do both, I salute you.
     
  26. Offline

    bradcland

    This thread was a good read.
     
  27. Offline

    Astrognome

    I'm still pretty new to java. (Like 2 days or so) and even I know when to use which. I agree with the original guy. You can just store in ram if you have 6 lines of data, but if you have 600 lines of data, it might be better to use a db.
     
  28. Offline

    hash

    Except multiply that by at least like 500. Seriously, 600 lines would strain an antique atari, but that's about it.
     
  29. Offline

    Plague

    Meh, I think I'll stop reading now. Some people just do not get how much overhead a DB adds. Yes, SQLite adds the least of many systems, but it is still pretty big. But hey, screw those that do not listen or look at the code themselves, I write my own plugins anyway.
     
  30. Offline

    Astrognome

    Sorry about the 600 thing. It's still a low number. Still a bit new to all this.
     