An onto function is a function in which each element of set B has a pre-image i.e. codomain = range. To count number of onto functions, we can use inclusion-exclusion principle.
So total no. of functions from A to B are clearly kn, since each of the n elements in A has k choices of mapping. But this is overestimation of onto functions because some elements in B may be there which were not mapped by any of the n elements of A. So we subtract those functions in which exactly 1 element of B was not mapped, if this was the case, then each of the n element had only k-1 choices, so no. of functions of these kind are k*[(k-1)n]. Here k is multiplied because that one element which was left out in B can be any of the k elements of B. But this strategy subtracted those functions two times, which had two elements of B not mapped, so we have to add those functions one time in which exactly 2 elements of B were not mapped. So if 2 elements were not mapped, then each of the n elements of A had k-2 choices, so we have kC2*(k-2)n functions, here we have multiplied kC2 because those 2 left out elements can be chosen in kC2 ways. So our current count becomes kn - kC1*(k-1)n + kC2*(k-2)n.
We can continue like this and so we get
Now we have calculated no. of onto functions, we can go back to our original problem of stirling numbers of second kind. So that problem is exactly same as this onto problem, except that here we are distinguishing k subsets also, but in our original problem, k subsets were identical, so we divide whole thing we derived by k!, so our final formula for S(n,k) becomes
So for example S(6,4) becomes