UPDATE: This issue initially proposed allowing unlimited table name lengths, but as the discussion below continued, we decided that it's much simpler to allow longer - but not unlimited - names. It turned out that Cassandra already took that route, accidentally: 8 years ago it started to allow names longer than 48 characters, still bounded by the filesystem's limit on file name length. So we changed this issue to request that Scylla follow suit and allow longer - but not unlimited - table names (as well as keyspace names, view names, and index names, which all translate to path names).
This is a low-priority and esoteric feature proposal.
Following Cassandra's lead, Scylla has a limit on the length of table names - schema::NAME_LENGTH = 48. However, this limit is only superficially enforced by the CQL layer (create_table_statement), and if that test is removed or bypassed, table names can be allowed to be any length.
But today this doesn't quite work, because of an additional problem. When the existing code creates a directory to store the new table, the directory's name is built by keyspace::column_family_directory(), which concatenates the table's full name, a dash (-), and a 32-byte UUID string. Because most Linux filesystems limit filename components to 255 bytes, and the dash and UUID take up 33 of them, any table name longer than 222 bytes will attempt to mkdir() a directory name longer than the allowed 255 bytes, and fail. Even worse, this mkdir() failure is treated as a failed I/O and causes Scylla to shut down, thinking an unrecoverable I/O error has occurred.
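The arithmetic behind the 222-byte threshold can be sketched as follows. This is only an illustration: the constant and function names are hypothetical and do not come from the Scylla source; only the numbers 255, 32 and the dash are taken from the description above.

```cpp
#include <cstddef>

// Hypothetical names, for illustration only.
constexpr std::size_t max_filename_component = 255; // per-component limit on most Linux filesystems
constexpr std::size_t uuid_suffix_length = 1 + 32;  // the dash plus the 32-byte UUID string

// Length of "<table name>-<uuid>" for a table name of the given size in bytes.
constexpr std::size_t full_directory_name_length(std::size_t table_name_bytes) {
    return table_name_bytes + uuid_suffix_length;
}

// True if the resulting directory name stays within the filesystem limit.
constexpr bool directory_name_fits(std::size_t table_name_bytes) {
    return full_directory_name_length(table_name_bytes) <= max_filename_component;
}
```

Under these assumptions a 222-byte table name yields a directory name of exactly 255 bytes, and a 223-byte name is the first one for which mkdir() would fail.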
I think there is actually a very simple way to solve this problem and allow arbitrarily long table names:
First we should note that the table's name isn't really necessary in the directory name, because the directories are already guaranteed to be unique by virtue of the UUID tacked on at the end. So we could have dropped the table name from the directory name altogether. However, changing keyspace::column_family_directory() in this way would break backward compatibility with already-existing tables (on boot, Scylla would not know where to find the individual tables' directories). Also, external scripts may find it useful that the directory name includes the table's familiar name and not only an obscure UUID.
So the simple solution is for keyspace::column_family_directory() to use only the first schema::NAME_LENGTH (48) bytes of the name in the directory name, instead of the whole name.
The idea is that:
- Existing table names are already limited to 48 bytes so backward compatibility isn't broken.
- The directory name shows not only the UUID but also the first 48 bytes of the table's name (which will almost always be useful).
- Even if the table's name has 1000 bytes, we don't try to create a directory name longer than 48+33 = 81 bytes.
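The proposal above could be sketched roughly like this. It is a minimal standalone illustration, not the actual Scylla code: table_directory_name() is a hypothetical stand-in for keyspace::column_family_directory(), and it assumes the UUID is passed in already rendered as a 32-byte string.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Mirrors schema::NAME_LENGTH from the issue text; the name here is hypothetical.
constexpr std::size_t name_length = 48;

// Hypothetical stand-in for keyspace::column_family_directory():
// builds "<first 48 bytes of the table name>-<uuid>".
std::string table_directory_name(std::string_view table_name, std::string_view uuid_string) {
    // substr() clamps the count, so names shorter than 48 bytes are kept whole.
    std::string dir(table_name.substr(0, name_length));
    dir += '-';
    dir += uuid_string;
    return dir;
}
```

With this sketch, a table name of up to 48 bytes produces the same directory name as today, while a 1000-byte name produces a directory name of 48 + 1 + 32 = 81 bytes.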