In [141]: L = nn.Linear(10,10)
...: S = nn.Sequential()
...: S.add_module('a', L)
...: S.add_module('b', L)
...: len(list(S.parameters()))
...:
Out[141]: 4
Listing a shared parameter more than once seems wrong: an optimizer such as SGD would then receive the same tensor twice and, I think, apply its update to it twice per step.
Also, from a theoretical point of view, I would say this model has only 110 parameters (100 weights plus 10 biases in the single shared Linear layer), not 220.
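As a workaround, shared parameters can be filtered by object identity before counting them or handing them to an optimizer. The snippet below is only a minimal sketch of that idea: `unique_parameters` and `count_unique_elements` are hypothetical helper names, not part of PyTorch, and the only assumptions are that a shared parameter shows up as the very same Python object each time it is listed and that each parameter has a `numel()` method.

```python
def unique_parameters(params):
    """Yield each parameter object once, keyed by identity (hypothetical helper)."""
    seen = set()
    for p in params:
        # A shared parameter is the same Python object, so id() matches.
        if id(p) not in seen:
            seen.add(id(p))
            yield p


def count_unique_elements(params):
    """Sum numel() over distinct parameter objects only (hypothetical helper)."""
    return sum(p.numel() for p in unique_parameters(params))
```

For the `S` above, `count_unique_elements(S.parameters())` should come out as 110 whether or not `parameters()` repeats the shared tensors, and something like `torch.optim.SGD(unique_parameters(S.parameters()), lr=0.1)` should give the optimizer each tensor only once.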