Per-project users and groups (aka service groups)

In Wikimedia Labs, we’re using OpenStack with heavy LDAP integration. With this integration we have a concept of global groups and users. When a user registers with Labs, the user’s account immediately becomes a global user, usable in all of Labs and its related infrastructure. When the user is added to an OpenStack project, the user is also added to a global group, which is usable throughout the infrastructure.

Global users and groups are really useful for handling authentication and authorization at a global level, especially when interacting with things like Gerrit and other global services. Global users can also be used as service accounts within instances, between instances or between projects. There are a number of downsides to global users, though:

  1. The global user creation process is laborious.
  2. Global users must provide an email address.
  3. Global users have a fixed home directory in /home (which is an autofs NFS or GlusterFS mount).
  4. Global users can authenticate to all services, even if that’s not necessary or wanted.
  5. Global users must have shell rights and must be added to a project to be properly usable for data access inside of a project.
  6. Users get confused when told to create Labs accounts to be used as service accounts.
  7. If multiple users want to access a global user, it’s necessary for the credentials to be shared, or to have a project admin create a sudo policy.
  8. Global users don’t have personal groups, but instead have global groups, which are projects. Limiting access to data for a global user is difficult.

It’s also possible to define system users and groups via Puppet and have those applied to instances within a project. There’s one major downside to this, though: the changes would need to go through review first, which bottlenecks the process on the Operations team and muddies up the Puppet repository.

With the introduction of the Tools project, as a Toolserver replacement, there was a really strong need for service users and groups that could be handled in a simple way. Our Tools project implementer, Marc-Andre Pelletier, had a novel concept: per-project users and groups.

The concept and LDAP implementation

Marc’s initial concept was to have another set of OUs, like ou=project-people and ou=project-groups, where we’d create sub-OUs per project. We adjusted the concept when I showed Marc how we’re currently handling per-project sudo, though. Rather than using separate top-level OUs, we further extended the directory information tree (DIT) used by the OpenStack projects. Our OU for projects is ou=projects,dc=wikimedia,dc=org and the tools project entry is cn=tools,ou=projects,dc=wikimedia,dc=org. The extension for service groups also uses the sudo extension, and here’s the basic structure:

dn: cn=tools,ou=projects,dc=wikimedia,dc=org
objectClass: groupofnames
objectClass: extensibleobject
objectClass: top
cn: tools
info: servicegrouphomedirpattern=/data/project/%u
member: <member-list-goes-here>

# Single role, allowed to manage all project info
dn: cn=projectadmin,cn=tools,ou=projects,dc=wikimedia,dc=org
cn: projectadmin
roleOccupant: <member-list-goes-here>
objectClass: organizationalrole
objectClass: top

dn: ou=sudoers,cn=tools,ou=projects,dc=wikimedia,dc=org
ou: sudoers
objectClass: organizationalunit
objectClass: top

dn: ou=groups,cn=tools,ou=projects,dc=wikimedia,dc=org
ou: groups
objectClass: organizationalunit
objectClass: top

dn: ou=people,cn=tools,ou=projects,dc=wikimedia,dc=org
ou: people
objectClass: organizationalunit
objectClass: top

Note: for the project definition we have a pretty ugly hack. We’re sticking configuration information into the project entry using the info attribute and extensibleobject. We’re doing this so that both CLI tools and the web interface can know how to handle service groups. We should define our own schema and extend the object properly, but this was a quick and dirty hack to meet a hackathon deadline.
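To make the hack a bit more concrete, here is a minimal sketch of how a tool might read that info attribute. It assumes the values are simple key=value strings and that %u is replaced by a name the caller supplies; the function names parse_info and home_dir are hypothetical, and exactly which name gets substituted for %u is an assumption, not something the directory entry spells out.

```python
def parse_info(values):
    """Turn 'key=value' strings from the info attribute into a dict."""
    config = {}
    for value in values:
        key, sep, rest = value.partition("=")
        if sep:  # skip values that are not key=value pairs
            config[key.strip()] = rest.strip()
    return config


def home_dir(config, name):
    """Expand the home directory pattern; %u handling is an assumption."""
    pattern = config["servicegrouphomedirpattern"]
    return pattern.replace("%u", name)


# The info value from the tools project entry above.
config = parse_info(["servicegrouphomedirpattern=/data/project/%u"])
print(home_dir(config, "example"))
```

With the value shown in the project entry, this prints /data/project/example.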

Allowing the addition of service groups is easy enough. When a service group is requested, a user and group are added to ou=people and ou=groups respectively. However, this doesn’t make the service groups easily accessible. To solve this, we automatically add a project sudo policy for every service group. Here’s a single service group added to the DIT:

dn: cn=local-example,ou=groups,cn=tools,ou=projects,dc=wikimedia,dc=org
objectClass: groupofnames
objectClass: posixgroup
objectClass: top
gidNumber: 100000
member: <member-list-goes-here>
cn: local-example

dn: uid=local-example,ou=people,cn=tools,ou=projects,dc=wikimedia,dc=org
objectClass: person
objectClass: shadowaccount
objectClass: posixaccount
objectClass: top
uid: local-example
cn: local-example
sn: local-example
loginShell: /usr/local/bin/sillyshell
homeDirectory: /data/project/example/
uidNumber: 100000
gidNumber: 100000

dn: cn=runas-local-example,ou=sudoers,cn=tools,ou=projects,dc=wikimedia,dc=org
sudoOption: !authenticate
objectClass: sudorole
objectClass: top
sudoRunAsUser: local-example
sudoCommand: ALL
sudoUser: %local-example
cn: runas-local-example
sudoHost: ALL

Project members can create service groups. The project member that creates the service group is the initial service group member. Members of the service group are allowed to add other members. All service group members can sudo to the service group without authentication.

The user and group have the same name, and the same uid/gid number. The uid/gid range was meant to be reserved and allowed to overlap with service groups in other projects. Similarly, the user and group names were prefixed with local- and were allowed to overlap with service groups in other projects. Both the uid/gid range and the non-unique user/group names have changed since the initial implementation, but I’ll get into that later.
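The three entries that make up a service group follow a fixed pattern, so they can be generated from a handful of inputs. The sketch below is only an illustration of that pattern, not the code we actually run: the function names are hypothetical, the caller supplies the id number, home directory, shell and member DNs, and it does no LDIF escaping or base64 encoding, so it assumes plain ASCII values.

```python
BASE = "ou=projects,dc=wikimedia,dc=org"


def ldif_entry(dn, attrs):
    """Render one entry as LDIF text from (attribute, values) pairs."""
    lines = [f"dn: {dn}"]
    for key, values in attrs:
        lines.extend(f"{key}: {value}" for value in values)
    return "\n".join(lines)


def service_group_ldif(project, name, id_number, members, home, shell):
    """Build the group, user and sudo policy entries for one service group."""
    sg = f"local-{name}"
    project_dn = f"cn={project},{BASE}"
    number = str(id_number)  # uid and gid share the same number
    group = ldif_entry(f"cn={sg},ou=groups,{project_dn}", [
        ("objectClass", ["groupofnames", "posixgroup", "top"]),
        ("cn", [sg]),
        ("gidNumber", [number]),
        ("member", members),
    ])
    user = ldif_entry(f"uid={sg},ou=people,{project_dn}", [
        ("objectClass", ["person", "shadowaccount", "posixaccount", "top"]),
        ("uid", [sg]), ("cn", [sg]), ("sn", [sg]),
        ("loginShell", [shell]),
        ("homeDirectory", [home]),
        ("uidNumber", [number]),
        ("gidNumber", [number]),
    ])
    sudo = ldif_entry(f"cn=runas-{sg},ou=sudoers,{project_dn}", [
        ("objectClass", ["sudorole", "top"]),
        ("cn", [f"runas-{sg}"]),
        ("sudoUser", [f"%{sg}"]),       # any member of the group
        ("sudoRunAsUser", [sg]),        # may run commands as the service user
        ("sudoHost", ["ALL"]),
        ("sudoCommand", ["ALL"]),
        ("sudoOption", ["!authenticate"]),  # without a password prompt
    ])
    return "\n\n".join([group, user, sudo])


print(service_group_ldif(
    "tools", "example", 100000,
    ["uid=someuser,ou=people,dc=wikimedia,dc=org"],  # hypothetical member DN
    "/data/project/example/", "/usr/local/bin/sillyshell"))
```

The sudo policy is what ties the pieces together: sudoUser points at the group, and sudoRunAsUser points at the user of the same name.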

The instance implementation

We’re using nslcd, which has support for defining multiple OUs for available services (passwd, shadow, group, etc.). To make service groups available on instances in a project, we use an approach similar to per-project sudo. Puppet has a variable defined for the OpenStack project, which is then used when adding the service group specific OUs to nslcd.conf:

# root
base dc=wikimedia,dc=org

# Global users and groups
base passwd ou=people,dc=wikimedia,dc=org
base shadow ou=people,dc=wikimedia,dc=org
base group ou=groups,dc=wikimedia,dc=org

# Per-project users and groups (service groups)
base passwd ou=people,cn=tools,ou=projects,dc=wikimedia,dc=org
base shadow ou=people,cn=tools,ou=projects,dc=wikimedia,dc=org
base group ou=groups,cn=tools,ou=projects,dc=wikimedia,dc=org

There’s a pretty massive gotcha here: if any of the OUs don’t exist for a service, it breaks the entire service. It’s very important that the OUs exist for all projects in which this will be used.
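One way to guard against that gotcha is to refuse to emit the per-project base lines unless the OUs are known to exist. The sketch below shows the idea in Python rather than in our Puppet templates; the function names are hypothetical, and existing_dns stands in for whatever lookup against LDAP you would actually do.

```python
ROOT = "dc=wikimedia,dc=org"


def project_ous(project):
    """Map each nslcd service to the per-project OU it should search."""
    project_dn = f"cn={project},ou=projects,{ROOT}"
    return {
        "passwd": f"ou=people,{project_dn}",
        "shadow": f"ou=people,{project_dn}",
        "group": f"ou=groups,{project_dn}",
    }


def nslcd_base_lines(project, existing_dns):
    """Return the per-project base lines, or fail if an OU is missing."""
    ous = project_ous(project)
    missing = sorted({dn for dn in ous.values() if dn not in existing_dns})
    if missing:
        # A missing OU would break the whole service, so stop here instead.
        raise ValueError(f"missing OUs for project {project}: {missing}")
    return [f"base {service} {dn}" for service, dn in ous.items()]


# Stand-in for the set of DNs found in the directory.
existing = {
    "ou=people,cn=tools,ou=projects,dc=wikimedia,dc=org",
    "ou=groups,cn=tools,ou=projects,dc=wikimedia,dc=org",
}
print("\n".join(nslcd_base_lines("tools", existing)))
```

For the tools project this produces the same three per-project lines shown in the nslcd.conf excerpt above.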

After-implementation changes and further changes to come

We added service groups prior to the tools project ramp-up, which was in April 2013. Since then we’ve made some modifications from the original design.

The first issue we ran into was with non-globally-unique uids and gids. Users can have a large number of groups, due to multiple project membership and service group membership. We’re providing per-project NFS shares, and NFS has a 16-group limit defined in its RFC. A way around this is to use the --manage-gids option in rpc.mountd. With --manage-gids, the NFS server does the secondary group lookup itself and ignores whatever the client sends to it. Here’s where the problem with non-unique uids and gids comes in: the server needs to know about all groups in all projects. We sync group info from LDAP into the NFS server’s local groups file, and we can name the groups whatever we want, but for multi-tenancy purposes the uids and gids need to be unique. Also, in general, making the uids and gids unique makes it easier to integrate other services that aren’t natively multi-tenant.
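To illustrate why overlapping gids are a problem for that sync, here is a rough sketch of the kind of check and output involved. It is not our actual sync script: the function names, the project.group naming scheme, the second project name and the member names are all made up for the example.

```python
from collections import defaultdict


def find_gid_collisions(groups_by_project):
    """Return {gid: [(project, group), ...]} for gids used more than once."""
    seen = defaultdict(list)
    for project, groups in groups_by_project.items():
        for name, gid, _members in groups:
            seen[gid].append((project, name))
    return {gid: owners for gid, owners in seen.items() if len(owners) > 1}


def group_file_lines(groups_by_project):
    """Render group-file lines (name:x:gid:members), refusing duplicate gids."""
    collisions = find_gid_collisions(groups_by_project)
    if collisions:
        raise ValueError(f"gids shared across groups: {collisions}")
    lines = []
    for project, groups in sorted(groups_by_project.items()):
        for name, gid, members in sorted(groups, key=lambda g: g[1]):
            # The group name is arbitrary on the server; the gid is what matters.
            lines.append(f"{project}.{name}:x:{gid}:{','.join(members)}")
    return lines


# Two projects that both handed out gid 100000 under the original design.
overlapping = {
    "tools": [("local-example", 100000, ["alice"])],
    "otherproject": [("local-example", 100000, ["bob"])],
}
print(find_gid_collisions(overlapping))
```

With the overlapping data above, the collision check reports gid 100000 as claimed by both projects, which is exactly the case the server cannot resolve when it does the group lookup on its own.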

The second issue is with non-unique service group names. It’s easier to integrate service groups into services that aren’t natively multi-tenant if the service group names are unique as well. We’ll be changing the naming scheme from being prefixed with local- to being prefixed with <projectname>- or some variant of that. We have a specific use case in mind, using Gerrit, for this change, but I’ll write a follow-up post about that.