Skip to main content

Dividing n elements into groups of at least size g minimizing the size of each group

At work there's a little project to break up groups of employees into random groups (size 4) and set up Zoom calls between them. We're doing this to replace the serendipitous meetings that sometimes occur around coffee machines, in lunch lines or while waiting for the printer. And also, we just want people to get to know each other.

Which lead to me writing some code. The core of which is 'divide n elements into groups of at least size g minimizing the size of each group'. So, suppose an office has 15 employees in it then it would be divided into three groups of sizes 5, 5, 5; if an office had 16 employees it would be 4, 4, 4, 4; if it had 17 employees it would be 4, 4, 4, 5 and so on.

I initially wrote the following code (in Python):

    groups = [g] * (n//g)

    for e in range(0, n % g):
        groups[e % len(groups)] += 1

The first line creates n//g (// is integer division) entries of size g (for example, if g == 4 and n == 17 then groups == [4, 4, 4, 4]). The for loop deals with the 'left over' parts that don't divide exactly into groups of size g. If g == 4 and n == 17 then there will be one left over element to add to one of the existing [4, 4, 4, 4] groups resulting in [5, 4, 4, 4].

The e % len(groups) is needed because it's possible that there are more elements left over after dividing into equal sized groups than there are entries in groups. For example, if g == 4 and n == 11 then groups is initially set to [4, 4] with three left over elements that have to be distributed into just two entries in groups.

So, that code works and here's the output for various sizes of n (and g == 4):

    4 [4]
    5 [5]
    6 [6]
    7 [7]
    8 [4, 4]
    9 [5, 4]
    10 [5, 5]
    11 [6, 5]
    12 [4, 4, 4]
    13 [5, 4, 4]
    14 [5, 5, 4]
    15 [5, 5, 5]
    16 [4, 4, 4, 4]
    17 [5, 4, 4, 4]

But the code irritated me because I felt there must be a simple formula to work out how many elements should be in each group. After noodling on this problem I decided to do something that's often helpful... make the problem simple and naive, or at least, the solution simple and naive and so I wrote this code:

    groups = [0] * (n//g)

    for e in range(n):
        groups[e % len(groups)] += 1

This is a really simple implementation. I don't like it because it loops n times but it helps visualize something. Imagine that g == 4 and n == 17. This loop 'fills up' each entry in groups like this (each square is an entry in groups and numbers in the square are values of e for which that entry was incremented by the loop).

So groups ends up being [5, 4, 4, 4].  What this helps see is that the number of times groups[i] is incremented depends on the number of times the for loop 'loops around' on the ith element. And that's something that's easy to calculate without looping.

So this means that the code is now simply:

    groups = [1+max(0,n-(i+1))//(n//g) for i in range(n//g)]

And, to me at least, that is more satisfying. n//g is the size of groups which makes the loop update each entry in groups once. Each entry is set to 1 + max(0, n-(i+1))//(n//g). You can think of this as follows:

1. The 1 is the first element to be dropped into each entry in groups.

2. max(0, n-(i+1)) is 'the number of elements left over once you've placed 1 in each of the elements of groups up to position i'. It's divided by n//g to work out how many times the process of sharing out elements (see the naive loop above) will loop around.

If #2 there isn't clear consider the image above and imagine we are computing groups[0] (n == 17 and g == 4). We place 1 in groups[0] leaving 16 elements to share out. If you naively shared them out you'd loop around four times and thus need to add 16/4 elements to groups[0].

Move on to groups[1] and place a 1 in it. Now there are 15 elements to share out, that's 15/4 (which is 3 in integer division) and so you place 4 in groups[1]. And so on...

And that solution pleases me most. It succinctly creates groups in one shot. Of course, I might have over thought this... and others might think the other solutions are clearer or more maintainable.






 

Comments

Popular posts from this blog

Your last name contains invalid characters

My last name is "Graham-Cumming". But here's a typical form response when I enter it: Does the web site have any idea how rude it is to claim that my last name contains invalid characters? Clearly not. What they actually meant is: our web site will not accept that hyphen in your last name. But do they say that? No, of course not. They decide to shove in my face the claim that there's something wrong with my name. There's nothing wrong with my name, just as there's nothing wrong with someone whose first name is Jean-Marie, or someone whose last name is O'Reilly. What is wrong is that way this is being handled. If the system can't cope with non-letters and spaces it needs to say that. How about the following error message: Our system is unable to process last names that contain non-letters, please replace them with spaces. Don't blame me for having a last name that your system doesn't like, whose fault is that? Saying "Your

How to write a successful blog post

First, a quick clarification of 'successful'. In this instance, I mean a blog post that receives a large number of page views. For my, little blog the most successful post ever got almost 57,000 page views. Not a lot by some other standards, but I was pretty happy about it. Looking at the top 10 blog posts (by page views) on my site, I've tried to distill some wisdom about what made them successful. Your blog posting mileage may vary. 1. Avoid using the passive voice The Microsoft Word grammar checker has probably been telling you this for years, but the passive voice excludes the people involved in your blog post. And that includes you, the author, and the reader. By using personal pronouns like I, you and we, you will include the reader in your blog post. When I first started this blog I avoid using "I" because I thought I was being narcissistic. But we all like to read about other people, people help anchor a story in reality. Without people your bl

The Elevator Button Problem

User interface design is hard. It's hard because people perceive apparently simple things very differently. For example, take a look at this interface to an elevator: From flickr Now imagine the following situation. You are on the third floor of this building and you wish to go to the tenth. The elevator is on the fifth floor and there's an indicator that tells you where it is. Which button do you press? Most people probably say: "press up" since they want to go up. Not long ago I watched someone do the opposite and questioned them about their behavior. They said: "well the elevator is on the fifth floor and I am on the third, so I want it to come down to me". Much can be learnt about the design of user interfaces by considering this, apparently, simple interface. If you think about the elevator button problem you'll find that something so simple has hidden depths. How do people learn about elevator calling? What's the right amount of